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Michael J. Di Toro was born in Campobasso, 
Italy, on June 24, 1910 and came to the United 
States in 1916. He received the E.E. degree in 1931, 
the M.E.E. degree in 1933, and the D.E.K. degree in 
1946, all from the Polytechnic Institute of Brooklyn. 

In 1934, Dr. Di Toro joined the laboratories of the 
Thomas A. Edison Company in West Orange, N.J. 
where he engaged in the development of electro- 
acoustical and electromechanical devices used in re- 
cording and reproducing sound. In 1938, while in 
charge of acoustical development, he published the 
first analytical paper on phonograph tracing distor- 
tion. 

He became associated with the Hazeltine Electro- 
nics Corporation in 1941, and, as senior electrical en- 
gineer, was in charge of projects concerned with air- 
ground pulse time modulated telemetering, vhf wave 
meters, and the development and production of 
miniaturized large bandwidth-delay product video 
delay lines, for which he evolved the new technique 
of multilayer winding with self capacitance for the 
correction of phase distortion. 

In 1946 he joined the Microwave Research Insti- 
tute of Polytechnic Institute of Brooklyn, becoming 
assistant director in 1947. He supervised and partici- 
pated in projects on uhf power meters, fm transient 
distortion analysis, and nonlinear network research. 


Micwaet J. Di Toro, Jr. 


At present he is an adjunct professor in the graduate 
school of electrical engineering 

From 1947 to 1952 Dr. Di Toro was with Federal 
Telecommunications Laboratories where he initiated 
and supervised several projects for the armed forces. 
These included a new type scanning comb filter re- 
ceiver for passive sonar detection, speech compression 
and noise suppression systems, and reliable long- 
range pulsed ionospheric air-ground radio communi- 
cation system. 

During 1952-1953 he was in charge of a missile 
instrumentation project for A.B. DuMont Labora- 
tories in Passaic, N.J. In 1953 he accepted the posi- 
tion of head of the Electronics Development Divi- 
sion at Fairchild Guided Missile Division in Wyan- 
danch, L.I. He joined the Polytechnic Research and 
Development Co., Inc. in Brooklyn, N.Y., as chief 
electronics engineer in 1955, and became Director 
of Engineering in 1956. 

Dr. Di Toro was elected a Fellow of the American 
Acoustical Society in 1953. He is a senior member of 
the IRE, and chairman of the PGIT. He is also a 
member of the AIEE, the American Physical Society, 
the American Management Association, Sigma Xi, 
and Eta Kappa Nu, and is a registered professional 
engineer in New York. He has received twenty 
patents. 
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Applications of Information Theory 


MICHAEL J. DI TORO, JR. 
Chairman, Professional Group on Information Theory 


It is now almost four years since the appearance of the 
first IRE TRANSACTIONS ON INFORMATION THEORY, and even 
longer since the pioneering work in this field of Wiener, 
Shannon, Fano, Woodward, and others. A large number of 
papers have been published and already six symposia of 
international scope have been held in London, Boston, and 
New York. Besides the direct communication, radar and 
data handling applications, other diverse fields have been 
considered such as machine translation, reliable computer 
operation, automatic speech recognition, pattern learning and 
recognition, theory of hearing, management, and many others. 
It cannot be doubted that, with the continual expenditure 
of sufficient funds and manpower, the cross-fertilization 
arising from all this activity should increase the rate of con- 
ception of new ideas. 

But conception is wonderful—then comes labor. Has not 
the time been reached when it is necessary to take stock of 
progress in knowledge since the beginning of the era of 
Information Theory? Should not we determine now how 
such learning may be applied, and with what expected pay- 
off, in the design of economically justifiable equipment in 
the wide fields of interest of the other twenty-three profes- 
sional groups of the IRE? 

An example of what has been done without aid from 
Information Theory is the discovery of pulse code modu- 
lation by A. H. Reeves in 1939, representing a major break- 
through in a method of coding. It is quite remarkable that 
pem comes within about 8 db of signal/noise (for a prac- 
tically small error probability) to the ideal channel capacity 
given by Shannon’s now famous law. On the other hand, 
even with all of the available body of theory, consider the 
rather meager results relative to expenditure of funds and 
effort obtained in some of the bizarre schemes proposed to 
conserve bandwidth in speech and tv or fax signals, when 
viewed in the sobering light of ecomonic jutification of the 
ensuing equipment. 

It is obviously necessary to encourage the application of 
the teachings of Information Theory by engineers in the 
other IRE professional groups. This can be aided consider- 
ably by PGIT through publication of original and higher 


2) 


caliber papers which can be written in the beautiful and 
practical style exemplified by J. R. Carson’s papers. Thus, 
it would seem desirable that the problem considered by an 
author should be stated preferably in nonmathematical lan- 
guage, and followed by a review of prior work and its limi- 
tations. The new results of the paper may then be given 
together with some typical nontrivial applications, and finally, 
in an appendix which need not be read nor understood by 
the reader in order for him to apply the results of the paper, 
the usual pyrotechnical display of the author’s mathematical 
prowess may be given. 

Of particular importance are papers stating basic laws 
of Information Theory which set theoretical bounds on per- 
formance beyond which one cannot go and therefore ahould 
not spend money and effort attempting to do so. Examples 
are Shannon’s channel capacity law, Wiener’s irreducible 
residual error in least-mean-square-error linear filtering, 
Gabor’s uncertainty principle between frequency uncer- 
tainty and time of occurrence uncertainty, and in the related 
field of circuit theory, the Paley-Wiener theorem regarding 
the nonphysical realizability of such structures as the ideal 
filter with rectangular amplitude and linear phase responses. 
These are the important tools by which, in the absence of 
a unified theory of system synthesis, one can be guided in 
the design and evaluation of suggested systems. 

Some of the future channels of research in Information 
Theory which need further investigation have been recently 
discussed briefly by Fano in a May, 1956 report for the Tech- 
nical Advisory Panel on Electronics, Office of the Assistant 
Secretary of Defense. It is urged that interested readers who 
are in a position to stimulate activity in the indicated areas 
should do so. 

Moreover it is also stressed that consideration be given to 
the man-machine matching problem, wherein the machine 
enhances, rather than replaces, man’s ear-brain, eye-brain, and 
his other highly developed functions which cannot now be 
satisfactorily replaced by the machine. 

Again, as in previous editorials, I should like to encourage 
PGIT members to air their views (e.g., take exceptions to 
the foregoing) in our PGIT Correspondence section. 
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In addition to the scheduled program, the following two papers, by A. N. Kolmogorov 
and V. L. Siforov, were presented at the 1956 Symposium on Information Theory. However, 
the manuscripts were received too late for inclusion in the September (Symposium) issue 
of these TRaNsActions. The papers were submitted in response to our invitation to these 
distinguished Russian scientists, and the following translations were distributed to those 
attending the Symposium.—The Editor. 


On the Shannon Theory of Information Transmission 
in the Case of Continuous Signals’ 


ANDREI N. KOLMOGOROVt 


I. InrRoDUCTION 


HE ROLE of the entropy of a random object &, 
capable of taking the values 2,, %, ---, % With 
the probabilities p,, po, +++, Dn; 


H(é) = — 2! Ps log px 


) 


in information theory and in the theory of information 
transmission using discrete signals, can be considered to 
have been explained sufficiently. Furthermore, I insist 
that the fundamental concept, which admits of general- 
ization to perfectly arbitrary continuous information and 
signals, is not directly the entropy concept but the concept 
of the quantity of information /(é, 7) in the random 
object — relative to the object 7. In the discrete case this 
quantity is evaluated correctly according to the well- 
known Shannon formula:’ 


I, n) = H(n) — MH(n/é). 


For a finite-dimensional distribution, possessing density, 
the quantity 7(é, 7) is determined, according to Shannon, 
by the analogous formula 


LE, 9) = h(n) — Mh(n/8), 


where h(n) is the “differential entropy” 


h(n) = — if ply) log ply) dy, 


and h(n/é) is the conditional differential entropy defined 
in an analogous manner. It is well known that the quantity 
h(é) has no direct real interpretation and is not even 
invariant with respect to coordinate transformation in 
the space of the z’s. For an infinitely-dimensional distri- 
bution, the analog of h(é) is nonexistent, in general. 
According to the proper meaning of the word, the 
entropy of the object — with a continuous distribution is 


* Presented at 1956 Symposium on Information Theory at 
Mass. Inst. Tech., Cambridge, Mass., September 10-12, 1956. 
Translated by Morris D. Friedman. 

+ Academician, Academy of Science, USSR. 

1 Tt seems expedient to me that the notation H(n/z) is the con- 
ditional entropy of 7 for £ = x and MH(7/é) is the mathematical 
expectation of this conditional entropy for the variable £. 


always infinite. If the continuous signals can, nevertheless, 
serve to transmit finitely great information, then it is 
only because they are always observed with bounded 
accuracy. Consequently, it is natural to define the ap- 
propriate “‘e-entropy” H.(&) of the object é by giving the 
accuracy of observation e. Shannon did thus under the 
designation “rate of creating information with respect to 
a fidelity criteron.’’ Although choosing a new name for 
this quantity does not alter the situation, I decided to 
call your attention to that proposition by underlining the 
more widespread interest in the concept and its deep 
analogy to the ordinary exact entropy. I imply, before- 
hand, that, as remarked in Section IV, the theorem on 
the extremal role of the normal distribution (both in the 
finite-dimensional and the infinite-dimensional cases) is 
retained for the « entropy. Furthermore, I give in Sections 
II and III, without pretending to its unconditional new- 
ness, an abstract formulation of the definition and 
fundamental properties of /(é, 7) and a survey of the 
fundamental problems of the Shannon theory of infor- 
mation transmission. Certain specific results obtained 
recently by Soviet investigators are explained in Sections 
IV to VI. I wish to emphasize, especially, the very 
significant interest, as it appears to me, in the investi- 
gations of the asymptotic behavior of the e entropy as 
e — 0. The cases investigated earlier 
H()~nlog=; H,@) = 2 

where n is the number of measurements and w is the 
bandwidth of the spectrum, are only very particular 
cases of the rules which can be encountered here. In order 
to understand the perspectives disclosed here, my note, 
explained in another terminology, might be of interest 
hence, I am placing a certain number of reprints at the 
disposal of the participants of the symposium. 

To a considerable degree, my report reproduces the 
contents of a report presented jointly with Iaglom anc 
Gel’fand at the Third All-Union Mathematics Conference 


2A, N. Kolmogorov, Doklady, AN USSR, vol. 108, no. 3, pp 
385-388; 1956. 
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lowever, since the present symposium is of a more 
gineering character, I omitted a number of mathematical 
etails. The work of Khinchin on the logical foundations 
f the theory remains beyond the limits of my survey. 
As regards the work of Soviet radio engineers, you will 
ear about some of them from the other speakers. In the 
ote itself, I will have occasion to note only the interest, 
1 principle, of certain early work of Kotel’nikov, circa 
933 (see Section VI, further). 


IJ. Quantity or INFORMATION IN ONE RANDOM 
OpsEct RELATIVE TO ANOTHER 


Let ~ and 7 be random objects with regions of possible 
alues X and Y, 


Ria bikers), PAB)= Py ep) 


he appropriate probability distributions, and 
P,(C) = P(E, ) e C) 


he joint probability distribution of the objects & and 7. 
sy definition, the quantity of information in the random 
bject € relative to the random object 7 is given by the 
ormula 


I(é, ) = i [ P;,(dx dy) log eee (1) 


‘he exact meaning of this formula requires certain eluci- 
ation and the general properties of /(&, 7), given later, 
re correct only for certain limitations of a set-theoretical 
haracter on the distributions P;, P, and P;,, but I will 
ot dwell on this here. In every case, the general theory 
an be explained, without great difficulty, in such a way 
hat it will be applicable to random objects & and 7 of 
ery general nature (vectors, functions, generalized 
ictions, ete.). 

Eq. (1) can be considered to be due to Shannon although 
e was limited to the case 


P(A) = f pila) dv; PB) = [ re) dy 


PoC) = ff pede, y) de dy 


Then (1) transforms into 


Lig AC 
IE, n) = [ I Pa, y) ine dx dy. 


ometimes, it is useful to represent the distribution P as 


P.(C) = ff ale, WPedx)P,y) + 810) @) 


here the function S(C) is singular relative to the product 
PCP: 
f the singular component of S is lacking, then the formula 


(3) 


Aen = até, n) 
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determines the random quantity a@;, uniquely to the 
accuracy of probability zero. Sometimes, the following 
theorem formulated by Gel’fand and Iaglom? is useful. 

Theorem: Tf SQ x) Y) => 0, thengl (é--y)r =o at 
Ste) a= Othen 


1¢, 2) = | fate, 9) log ale, )P(aa)P,(Ay) 


= / / log a(a, y)Pe,(dx dy) 
xX XY 


== Yl log tn. (4) 


Let us enumerate certain fundamental properties of 


T(E, 7). 


Del (Sin aa ine) 
2) I(é, n) => 0; T(E, n) = 0 only if € and 7 are independ- 
ent. 


3) If the pair (é,, 7.) and (&, 7.) are independent, then 


Is, £2), (m, ne) | = Té, m) ail I(&., 2) « 


A) AINE, 0), 62 ea 

5) I[(é, 2), §] = L(n, §), if and only if é, 9, ¢ isa Markov 
sequence, 7.e., if the conditional distribution of ¢ 
depends only on 7 for fixed ~ and 7. 


Apropos property 4, it is useful to note the following. 
In the case of the entropy 


He\i=s T(E 7); 


there is the bound on the entropy of the (&, 7) pair from 
above: 


H(é, ») < H(g) + H(n) 
as well as the bound from below which results from | and 4. 
Hé,) 2 B®); i Gen) eae 


A similar estimate for the quantity of information in ¢ 
relative to the (é, 7) pair does not exist. From 


OCS EG hes Uy Mell ie Km 0 
there still does not result the equality 
Tl, nm), §] ae 0, 


as can be shown by elementary examples. 
For later use, let us note the special case when & and 
are the random vectors: 


Eo (Sayer ye) 
n = (Mm) *°* , te) = Enary +72 tbmen)s 
and the quantities 
£1, fo, ++ y Emin 


are distributed normally with the second central moments 
8:3 = ME; — Me); — ME). 
3A. N. Kolmogorov, A. M. Iaglom, and I. M. Gel’fand, 


“Quantity of Information and Entropy for Continuous Distribu- 
tions,” Report at Third All-Union Math. Conf., 1956. 
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If the determinant 


C= 


Si; liege y een 


is not zero, then, as was calculated by Gel’fand and 
Iaglom 


1 AB 


I, 9) = 5 log = (5) 


where 


B 


L — . 
A = | 85; hop pens | si; eaarea Pees: 


It is often more expedient, however, to use another 
approach without the C > 0 limitation. As is known,* all 
the second moments s;; except those for which 7 = j or 
j = m+ 7% go to zero after a suitable linear coordinate 
transformation in the X and Y spaces. For such a choice 
of coordinates 


I(é, ») = —4 Ss [ies 1 (Ex ne) | 


where the summation is taken over those 


(6) 


k < min (m, n) 


for which the denominator in the expression of the 
correlation coefficient 


Sk m+k 
WV Sup *Smthk.m+k 


rE, m) = 
is not zero. 


III. Assrract EXPLANATION OF THE PRINCIPLES OF THE 
SHANNON THEORY 


Shannon considers the transmission of information 
according to the scheme 


SS ee 
where the ‘transmitting apparatus” 
1 
is characterized by the conditional distribution 
P,»/,(B’/y) = P(r’ & B’/y = y) 


of the “output signal’ 7’ for a given ‘input signal” 7 
and a certain limitation 


doe Ve 

of the input signal distribution P,. The ‘‘coding”’ 
f= 

and “decoding” operations 
Maree 

are characterized by the conditional distributions 

P,,(B/2) = P(ne B/t = 2) 
Pej lA’ /y’) = PE’ e Al/y = ¥'). 


4A, M. Obukhoy, Jzv. AN USSR, Phys.-Math. Series, pp. 
339-370; 1938. 
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The fundamental Shannon problem is the following. 
Given the spaces X, X’, Y, Y’ of possible values of the 
“input message” £ the ‘output message’ £, the input 
signal 7, and the output signal 7’; given the characteristics 
of the transmitter, 7.e., the conditional distribution P,,/, 
and the class V of admissible input signal distributions 
P,; finally, given the distribution 


P(A) = EE eA) 
of the input message and the “fidelity criterion” 
Pipe W 
where W is a certain class of joint distributions 
PrlC) = Pl, Ee C] 


of the input and output communications. To find: Is it 
possible, and if it is, by what means, to give a coding and 
decoding rule (7.e., the conditional distributions P,,; and 
P:-,,») 1 such a manner that by calculating the distri- 
bution P;; in terms of the distributions P:, Py/:, Py/_y 
P:-,, under the assumption that the sequence 


Ene a ae 


is Markovian, we will obtain 
Pree, £ W? 


As does Shannon, so let us define the 


“capacity’”’ of the 
transmitter thus | 


C = sup I(n, 7’) 


Prev 


and let us introduce the quantity 
Hy(t) = inf I, 2) 
Prrerew 


which Shannon calls the “rate of creating information 
relative to a fidelity criterion”? when computed per unit 
time. Then, the necessary condition of the possibility oy 
transmission 


Hy) <C 7) 


results at once from property 5 of Section IT. 

The incomparably deep idea of Shannon is that (7). 
when applied to the continuous operation of a “communi. 
cation channel,” is ‘‘almost sufficient” in a certain sense 
and under certain very broad conditions. From the math- 
ematical point of view, it is a matter here of proving ¢ 
limit theorem of the following type. It is assumed that the 
space X, X’, Y, Y’ of the distributions P; and P,-,, of the 
classes V and W, and therefore, of the quantities C’ anc 
Hw (é), depend on the parameter 7 (which plays the role 
in applications, of the duration of transmitter operation) 
It is required to establish that the condition 


(or 
Hy (€) 
is sufficient, under a certain sufficiently general characte 


of the assumptions, for the possibility of transmission 
satisfying the conditions formulated above, for sufficiently 


lim inf al 


To 0 


(8 


| 
| 
| 
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large 7’. Naturally, in such a formulation the problem is 
somewhat indistinct (for example, similar to the general 
problem of studying possible limit distributions for a sum 
of large numbers of “small” components). However, I 
intended to avoid any return to the terminology of the 
theory of stationary, random processes here, since it was 
shown in the note of the young Romanian mathematician 
Rozenblat-Rot Milu’ that interesting results can be 
obtained in the designated direction without the as- 
sumption of stationariness. 

Many remarkable works have been devoted to the 
derivation of a limit theorem of the kind indicated. The 
work of Khinchin® is the contribution of a USSR math- 
ematician in this research direction. It appears to me that 
much remains to be done here. Namely, results of this 
kind are intended to give a foundation to the widespread 
conviction that the expression 7(£, 7) is not just one of the 
possible methods of measuring the ‘‘quantity of infor- 
mation’ but it is a measure of the quantity of information 
having an advantage, in principle, over the others, actually. 
Since the “information,” by its original nature, is not a 
scalar, then axiomatic investigations, permitting [(€, 7) 
lor the entropy H(é)] to be characterized uniquely by 
using simple formal properties, in this respect have lesser 
value, in my opinion. The situation here seems to me to 
be similar to our being ready to assign, at once, the 
greatest value to that method, out of all those proposed 
by Gauss to give a foundation to the normal law of error 
distribution, which starts from the limit theorem for the 
sum of a large number of small components. Other methods 
(for example, based on the principle of the arithmetic 
mean) demonstrate only why any other error distribution 
law could not be as acceptable and suitable as the normal 
law but they do not answer the question of why the 
normal law is actually encountered often in real problems. 
Similarly, the beautiful formal properties of the expressions 
H(é) and J(é, ) cannot demonstrate why they are suf- 
ficient for the complete (albeit from the asymptotic point 
of view) solution of many problems in many cases. 


IV. CALCULATION AND ESTIMATION OF THE e€ ENTROPY 
IN CerTaIn PARTICULAR CASES 


If the condition 


Pye W 
is chosen as the certainty of exact coincidence of é and &’ 
EA a aad 
then 
Hy(é) = H&). 


In conformance with this, it seems to be natural to 
designate Hy(é) in the general case as the “entropy of 
the random object £ for the accuracy of reproduction W.” 


5 Rozenblat-Rot Milu, Vrudy, Third All-Union Math. Conf., 
vol. 2, pp. 132-133; 1956. 

6A. Ia. Khinchin, Usp. Mate. Nauk, vol. 11, no. 1 (67), pp. 
17-75; 1956. 
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Now, let us assume that X is a metric space and that 
the space X’ coincides with X, 7.e., methods are investi- 
gated of the approximate transmission of information 
from the point — e X by using the indication of the point 
£’ of the same space X. It seems natural to require that 


P(e 2) Se =1 (WO 
or that 


Mig ine) <2 eas aan (Vie 


We will denote these two forms of the ‘‘e entropy” of the 
distribution P; by 


Hy (f) = HE 
Hy (€) = Ht €). 


As regards the e entropy H?, I shall only note here a 
certain estimate for 


OG ee sup AX(é) 


where the upper bound is taken over all the probability 
distributions P; in the space X. As is known, for e = 0, 


Hy(X) = sup H&) = log N, 
Py 


where N, 7s the number of elements of the manifold X. For 
ess 


log Ni(2e) < Hi(x) < log Nie) 


where Ni(e) and N¢(e) are characteristics of the space X 
which are introduced in my note.” The asymptotic 
properties of the function N,(e) as e — 0, studied in my 
work’ for a number of specific spaces X, are interesting 
analogs of the properties, explained later, of the asymptotic 
behavior of the function H,(é). 

Let us now turn to the e entropy H,(&). If X is an n- 
dimensional Euclidean space and if 

uy i Pi dee 
A 

then, at least in the case of the sufficiently smooth function 
p(x), the following well-known formula holds: 


H(@) = n log = + [h@) — n log V2xe] + (1) @) 


where 


ni) = — | pela) log pela) dey --- de, 
x 

is the “differential entropy,’ already introduced in the 
first Shannon works. Hence, the asymptotic behavior of 
H.(é) in the case of sufficiently smooth continuous dis- 
tributions in n-dimensional space is determined, fo a first 
approximation, by the dimensionality of the space and the 
differential entropy h(£) only enters as the second term in 
the expression for H,(é). 

It is natural to expect that the growth of H.(£) as e— 0 
will be substantially more rapid for typical distributions 
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in infinite-dimensional spaces. As the simplest example, 
let us consider the Wiener random function &(¢), defined 
for 0 < ¢ < 1, with the normally-distributed independent 
increments: 
AE = g(t + At) — Ed) 
for which 
£0) = 0; 


MAé = 0; M(Aty = At. 


Iaglom found that in this case, in the L* metric 


4] il 
Hi =4) 4 05) (10) 
7 € 
Under certain natural assumptions, the formula 
4 /1 1 
4 =*,(4) + (3) iy 
Tv € € 


where 


MiGs) = ie Mb | t, £(4) | dt 


can be obtained in a more general way for the diffuse 
kind of Markov process on the ft, < ¢ < ¢, time segment 
with 


M ae — 
M(As) = 


A[t, &)] At + o(Ad); 
Blt, )] At + o(At). 

The e-entropy H, can be calculated exactly for the case 
of the normal distribution in an n-dimensional space or in 
Hilbert space. After a suitable orthogonal coordinate 


transformation, the n-dimensional vector £ assumes the 
form 


g = (£1, &2, ae ea) 


where the coordinates £ are mutually independent and 
distributed normally. The parameter 6, for given e¢, is 
determined from the equation 


& = D7 min (6, Dé) 


and in the case of é distributed normally 


HA) = 3 Oto BE, (12) 
The approximating vector 
Ss SS) ans) 
should be chosen such that 
f = 0 
for Dz, < 6° and 
SS es DA, = 0; D’&, = Dé, — & 


for D’é, > 6 and the vectors £ and A, are mutually 
independent. The infinite-dimensional case is in no way 
different from the finite dimensional. 

Finally, it is very essential that the maximum value of 
H ,(&) for the vector — (n dimensional or infinite dimensional) 
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be attained in the normal distribution case for given second 
central moments. This result can be obtained directly or 
from the following proposition of Pinsker.’ 

Theorem: Let the positive-definite symmetric matrix 
of the s,;,0 < 7,7 < m+n quantities and the distribution 
P, of the vector be given 


a (E,, &, or. >) 
for which the central second moments equal s,,; (for 
0 < 7,7 < m). Let the condition W on the joint distri- 
bution P;;, of the vector — and the vector 


36 a (En+1) pe by ee SE nage) 


be that the central second moments of the quantities 


ieee ae paee 
equal’s;; (for 0 (57,7 an) eben 
i AB 


Hy(€) S 5 log a (13) 

The notation in (13) corresponds to the explanation of 
Section II. It is seen from a comparison with the results 
of Section II, that inequality (13) becomes the equality 
in the case of the normal distribution P:. 

The principles of solving the variational problems 
arising in the calculation of the “rate of creating infor- 
mation” were indicated sufficiently long ago by Shannon. 
Shannon and Weaver’ write: ‘“Unfortunately these formal 
solutions are difficult to evaluate in particular cases and 
seem to be of little value.’’”® In substance, however, many 
problems of this kind are simple enough, as is seen from 
the above. It is possible that the slow development of 
investigations in this direction is related to insufficient 
understanding of the fact that the solution of the variation 
problem often appears to be degenerate in typical cases: 
For example, the evaluation of H.(&) in the problem 
selected above, for the normally distributed vector € in 
the n-dimensional case, the vector & often appears to be 
not n dimensional but only k dimensional with k < n; 
in the infinite-dimensional case, the vector & always 
appears to be finite dimensional. 


V. Quantity oF INFORMATION AND RATE OF CREATING 
INFORMATION IN THE STATIONARY PRocESS CASE 


Let us consider two stationary and stationarily-related 
processes, 


E(t), n(t) 


Let us denote by &7 and yn, the segments of the € and 7 
processes in the time 0 < ¢ < JT and by &_ and y_ the flow 
of the € and 7 processes on the negative semiaxis —«o < 
t < 0. To give the pair (&, ) of the stationarily-related & 
and 7 processes means to give the probability distribution 


OO) <A ie f= Or, 


7™M. S. Pinsker, Trudy, Third All-Union Math. Conf., vol. 1, p. 
125; 1956. 

§C. E. Shannon and W. Weaver, “The Mathematical Theory 
of Communication,’”’ University of Illinois Press; 1949. 

9 [bid., sec. 28, p. 79 of the Russian translation. 
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P.,, invariant to shift along the ¢ axis, in the space of the 
function pair }x(t), y(¢)}. If & is fixed, then the following 
conditional probability 


Pennse(C/a_) = P{(Er, n) © C/E. = 2} 


arises from the distribution P;,. Using this distribution, 
the conditional quantity of information 


T(Er, n/x) 


is calculated in conformance with Section II. If the 
mathematical expectation 


MI(Er, /&-) 
is finite for any 7’ > 0, then it is finite for all other 7 > 0 
and 


MI (ér, n/t-) = THE, 0). 


It is natural to call the quantity I(, 7) the “rate of creat- 
ing information of the process » for compliance with the 
process &.” If the process ~ can be extrapolated with 
complete accuracy to future occurrences, then 


I(, 1) = 0. 


In particular, this will be so if the process é has a bounded 
spectrum. Generally speaking, the following equality 


1, 1) = I(n, 8) (14) 


does not hold. However, under sufficiently broad conditions 
on the “regularity” of the process £,"° the equality 


1, n) = T(E, n) 


holds, where 
= ape 
TI, n) = lim T I(Er, nr) 
To30 


Since I(r, nr) = I (nz, Er), then always 


T(E, ) = I(n, 8) 


and, therefore, when both equalities T(é, 1) = T(é, n) and 
I(n, £) = I(n, &) are correct, equality (14) holds. Now, 
let W be a certain class of joint distributions P;;, of two 
stationary and stationarily-related processes & and &’. 
It is natural to call the equantity 
Hy) = inf 1,8) 

Perevew 
the “rate of creating information in the process £ under the 
accuracy of reproduction W.”’ It can be shown that 


H w(é) a Hy® 
where 


inf T(E, £’) 


Peg ew 


Hy = 


10 Here and later, the regularity of the process means, roughly 
speaking, that the segments of the process, corresponding to two 
segments of the ¢ axis sufficiently removed from each other, are 
almost independent. In the case of Gaussian processes, the well- 
known definition of regularity introduced in my work" is applicable 
here. 
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under certain assumptions on the regularity of the process 
£ and for certain natural types of conditions W. 


VI. CaLcuLATION AND ESTIMATION OF THE AMOUNT OF 
INFORMATION AND THE RATE OF CREATING 
INFORMATION IN TERMS OF THE SPECTRUM 


Pinsker"™ established the formula: 


1&1) =z f lost - ro) 3) 
where 
A HONE 
ra Fee) far) 


and fee, fer, fon are spectral densities; for the case when 
the distribution P;, is normal and at least one of the pro- 
cesses € or 7 is regular. In connection with the review by 
Doob,’” we would like to note that the novelty, in princi- 
ple, of the Pinsker result is somewhat greater than can be 
expected on the basis of this review. The expression 


h(é) = log (2r-V’e) + i ip S log fee(A) dA (16) 


is known in the case of processes with discrete time ¢ for 
the differential entropy of a normal process per unit time: 


f(g) = lim & WG, Bs) «+ 5 En). 


To0 
However, no analog of the expression h(£) exists in the 
continuous time and unbounded spectrum case and the 
Pinsker formula requires independent derivation. 
It is natural to characterize the accuracy of reproducing 
the stationary process &, using the stationary process ¢ 
stationarily related to &, by the quantity 


o = M{k(@) — #’(@)/ 
and in the case of a W condition of the form 
herons 
it is natural to call the quantity 
H.@) = Hy) 


the € entropy per unit time of the process é and under the 
assumption that 


Hy (8) me Hy 


the rate of creating information in the process é for average 
accuracy of transmission ¢. It can be concluded from the 
appropriate statement for finite-dimensional distributions 
(see Section IV) that the quantity H.(&) attains a maxi- 
mum in the case of the normal process é for a given spectral 
density f¢:(\). In the normal case, H,(é) can be calculated 
easily in terms of the spectral density exactly as was 
explained in Section IV applied to H .(€) for the n-di- 


11M. S. Pinsker, Doklady, AN USSR, vol. 98, 213-216; 1954. 
2 J. L. Doob, Math. Revs., vol. 16, p. 495; 1955. 
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mensional distribution. 
from the equation 


The parameter @ is determined 


2 


a ae fh min [6°, fz:(A)] da. (17) 


Using this parameter, the quantity H.(é) is found from 


the formula 


H() = fee (18) 
; fee(dye Y 
Spectral densities of the kind shown in Fig. 1 which 
are approximated well by the function: 
a fo A<|A|<A+W 
ree || 
lo in the remaining cases 
are of practical interest. It is easy to calculate that 
ce 
 ~ 
2W (19) 


H.(é) ~ W log a 


approximately, in this case for not too small ¢ for a normal 
process. 


A? 


Fig. 1 


Certainly, (19) is none other than the well-known 
Shannon formula 


R = W log £. (20) 
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Here, however, the novelty, in principle, is that now w 
see why and within what limits (for not too small e) thi 
formula can be used for a process with an unboundet 
spectrum and such are all the processes in the theory o 
information transmission which really interest us. 
Writing (19) thus 

H(t) ~ 2W log (aV/2W) log : (21 
and comparing with (9), we see that the double width 2) 
of the useful frequency band plays the role of the numbe 
of measurements. This idea of the equivalence of twice 
the frequency bandwidth to the number of measurements 
occurring in a certain sense of the word, per unit time 
was apparently first expressed by Kotel’nikov."* On th 
basis of this idea, Kotel’nikov indicated the fact that ; 
function, for which the spectrum is limited to bandwidtl 
2W, is determined uniquely by the values of the functior 
at the points 

2 1 1 2 k 


"9 ow? Tope 


OW’2W’ aes OW? 
Shannon retained this argumentation, using the repre 
sentation obtained in this manner to derive (20). Since | 
function with a bounded spectrum is always singular i 
the sense of my work'* and the observation of such | 
function is not related, generally, to the stationary floy 
of new information, then the sense of this kind of argumen 
tation does not remain completely clear so that the ney 
derivation of the approximate formula (21), cited here 
seems to me to be not devoid of interest. 

The growth of H,(é) as e decreases occurs, for small 
for any normally distributed regular random functior 
substantially more rapidly than would be obtained ac 
cording to (21). In particular, if f;:(\) has order 1/\6 a 
\— o, then h.(¢) has order 1/(2/.(@ — 1)). | 

1 V. A. Kotel’nikov, Material for the First All-Union Conf. c 
questions of communications; 1933. 


4A. N. Kolmogorov, Bulletin, Moscow Univ. I, no. 6; 194 
English translation available. 
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On Noise Stability of a System with Error-Correcting 
Codes’ 


VLADIMIR I. SIFOROVt 


Summary—The problem posed in this paper is to give a relation 
connecting the noise stability of a communication system in which 
error-correcting codes are used to the parameters of these codes. 
The amount x of code combinations is found, which differ from 
each other by not less than a given number of elements for a given 
arrangement of all the possible combinations. It is proved that the 
quantity x depends on the arrangement of the primary combinations. 
Inequalities are obtained for the greatest amount of combinations 
Xm) for which any two differ by not less than d elements for small 
and large values n of the common number of elements in each code 
combination. It is established that the probability of distortion, 
pn, of a code group in a system with correcting codes satisfies the 
inequality p, < f(p, n, d), where p is the probability of distortion 
of one element. The shape of the f(p, n, d) function is found for 
large values of n. Graphs of this function are constructed. 


INTRODUCTION 


NE OF THE means of guaranteeing the high 
() noise-stability and capacity of a communication 

system is to use error-correcting codes. The 
usual principle for constructing these codes is that ad- 
ditional symbols, intended to disclose and to correct 
the errors caused by noise, are added to code combinations 
of the usual kind. 

Shannon’ showed that if any 2° combinations are 
chosen at random from all 2” possible combinations of 
n-elementary dyadic signals, then the frequency of the 
error, caused by the noise, can be made as small as desired 
for C < C, and sufficiently large values of . In order to 
guarantee the greatest system capacity for the n value 
selected and for the sufficiently small error frequency 
given, the choice of the coding combinations must be 
made in such a way that they are as different as possible 
from each other. 

Correcting codes based on the use of only parts of combi- 
nations of elementary dyadic signals from all possible 
combinations, were investigated in a number of works, 
among which, for example, are those of Hamming,” 
Laemmel,’ Reed,* and Silverman and Basler.” 


* Presented at 1956 Symposium on Information Theory at Mass. 
Inst. Tech., Cambridge, Mass., September 10-12, 1956. 

+ Corresponding member, Academy of Science, USSR. 

1C, E. Shannon, ‘‘A mathematical theory of communication,” 
Bell Sys. Tech. J., vol. 27, p. 349; 1948. 

2R. W. Hamming, “Error detecting and error correcting codes,” 
Bell Sys. Tech. J., vol. 29, pp. 147-160; 1950. 

3A. BE. Laemmel, ‘‘Hfficiency of Noise-Reducing Codes’’ and 
“Communication Theory,” papers read at a symposium on “‘Appli- 
cations of Communication Theory” held at [EE, London, September 
22-26, 1952; also Butterworths Sci. Publ., pp. 111-118; 1953. 

4Reed, ‘A Class of Multiple-Error-Correcting Codes and the 
Detecting Scheme, M.I.T. Lincoln Lab. Tech. Rep. no. 44; 1953. 

5R. A. Silverman and M. Balser, ‘“‘Coding for constant-data-rate 
systems-part I, a new error-correcting code,’”’ Proc. IRE, vol. 42, 
pp. 1428-1435; September, 1954, and “part I1—multiple-error- 
correcting codes,” Proc. IRE, vol. 43, pp. 728-733; June, 1955. 


However, relations connecting the noise stability of 
the transmission system, in which the correcting codes 
are used, to the parameters of these codes, are not found 
in general-enough form in these and other works known 
to the author. The search for such general relations is a 
very difficult problem which must be solved by investi- 
gating a number of questions from the region of multi- 
dimensional geometry and number theory. An attempt 
is made in this paper to give a partial solution of the 
problem mentioned. 


On THE NUMBER OF CODE COMBINATIONS DrrrERING 
FROM HacuH OTHER By Not Less THAN A GIVEN 
NuMBER OF ELEMENTS (SMALL n Case) 


Let us consider the code combinations of n elements, 
each of which has the value 0 or 1. The common number 
of all possible combinations of this kind will be 2”. From 
these, let us select only those which satisfy the condition 
that any two of the selected combinations will differ from 
each other by not less than d elements. Let us call the 
quantity d the distance between the combinations. Let 
us denote by 2,, the greatest possible number of combi- 
nations selected which satisfy the condition mentioned 
above. 

Therefore, x,, is the greatest possible number of combi- 
nations for which any two are separated by a distance 
equal to or greater than d. Evidently, z,, is a function of 
n and d, 7.e., 


m = f(n, d). (1) 


Let us explain the definition cited here by a specific 
example. Let n = 3 and d = 2. The set of all possible 
combinations will be: 000, 100, 010, 110, 001, 101, O11, 111. 
The distances between any two of these eight combinations 
are within the limits from 1 to 3. Discarding the combi- 
nations separated by the unit distance, we easily confirm 
that there are only four combinations, satisfying the 
condition presented, 7.e., 


Ln = 1,2) 4. 


For example, these combinations are: 000, 110, 101, and 
011. Any two of these differ from each other by not less 
than two elements. 

Any two of the selected set of combinations differ from 
each other by not less than three elements for d = 3, 
n > 3. When such a set is used to transmit information, 
the distortion caused by noise can be corrected auto- 
matically in any one element. Actually, the distorted 
combination will differ for d = 3 from the true by one 


110 


element and from any other, by two or more elements. 
In other words, the distorted combination will be closer 
to the real for a single error than to the false. In principle, 
this affords the possibility of correcting single errors 
automatically. 

Similarly, it appears to be possible, for d = 5, to correct 
the errors in two elements automatically, for d = 7, in 
three and, in general, for any odd d in (d — 1)/2 elements. 
As is easily verified, automatic correction of (d/2 — 1) 
elements is possible for even values of d. For example, for 
d = 4, correction is possible of only a single error and for 
d = 8, correction of errors in three elements. 

The problem of searching for the z,, = f(n, d) function 
can easily be interpreted geometrically as the problem 
of finding the greatest number of vertices of an n-di- 
mensional ‘‘cube’’ for which any two differ from each 
other by not less than d coordinates. 

Shown in Fig. 1 is a three-dimensional cube (n = 38) 
and also the four vertices found above (000, 110, 101, 
and 011) for which any two differ from each other by not 
less than two coordinates (d = 2). 


(orl) 
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The number of code combinations which differ from 
each other in not less than given number of elements can 
be found for small values of n and d by direct computation. 

Let us assume that all 2” possible combinations are 
arranged in a series and that each has an index number. 
Let us compare each of the combinations, starting with 
the second and ending with the 2”-th, with the first. Let 
us cancel those which differ from the first in less than d 
elements. Now, let us enumerate the remaining combi- 
nations. Evidently, the greatest index number here will 
be less than 2”. Now, let us compare each of the remaining 
combinations, starting with the third and ending with the 
last, with the second. Let us cancel those which differ 
from the second in less than d elements. Let us enumerate 
the remaining combinations and let us compare each of 
them, starting with the fourth and ending with the last, 
with the third. Let us continue this procedure further in 
a similar manner and let us end it by comparing the last 
of the remaining combinations with the preceding. 

Let us denote by x the number of all combinations 
obtained as a result of the procedure described for given 
values of n and d. Evidently 


t= ed). (2) 
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In general « depends not only on n and d but on the 
initially-selected arrangement of all 2” primary combi 
nations. The z,, quantity which we introduced before ir 
(1) is the greatest value of the set of x values correspondins 
to all possible permutations of the 2” primary combinations 
for which the number is 2”!. 

Let us find x for a definite arrangement of the primary 
combinations in correspondence with the following pro. 
cedure. Let us write all the possible combinations fo: 
n = 1. These combinations are 0 and 1. Let us add te 
each of them first 0 and then 1. Then, we obtain a set o: 
all possible combinations for n = 2 as: 00, 10, 01, and 11 
Again let us add first 0 and then 1 to each of these combi: 
nations. Hence, we obtain the set of all possible combi: 
nations for n = 3 as 


000,;, 100, 010,110,001, 10% OL siite 


Adding first 0 and then 1 to each of these combinations 
we obtain the set of all possible combinations for n = 4 
Continuing this procedure further, we find the set of al 
primary combinations for n = 5, 6, 7, ete. 

EE. N. Zotova determined x in (2) for the arrangement 
described here of the primary combinations and calculatec 
the set of selected combinations of interest by the above: 
mentioned procedure. The results of these computations 
are given in Table I. 


TABLE I 
VALUES OF THE « = I(n, d) FUNCTION 
SOR: 
AY 1 | 4 5 6 7 8 9 10 
—— 
1 2 4) 8 | 16) 32>) 64 128 256 512) |) 1022 
2 —| 2 4 S| IG | By 64 128 256 ol2 
3 SS ee 2 4) 8 16 16 32 64 
4 = |= | = | Bl Bl 4 8 16 16 32) 
5 Hh 2 2 4 8 16 
6 2 2 2 4 8 
7 2 2, 2 2 
8 2 2 2 
9 2 2 
10 2 


In order to determine the degree of influence of thi 
arrangement of the primary combinations on x, Zotovs 
calculated this quantity for four pairs of fixed values o 
n and d and for ten different arrangements of the primary 
combinations. The results of these calculations are show1 
in Table II. ; 


TABLE II : 


| 
VALUES) OF 7 -= 


F(n, d) ror Various ARRANGEMENTS OF TH 
PRIMARY COMBINATIONS 


toi Wl 
DS 2 00 
QAQaaa 
| 
Bmw Rw 
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The data of these tables confirm the hypothesis cited 
above that the arrangement of the primary combinations 
affects the number of sets of combinations, sorted by the 
procedure described, for which any two differ from each 
other in not less than a given number of elements. 

It is easy to confirm that x depends on the arrangement 
of the primary combinations even for small values of n 
and d and, in particular, forn = 3 andd = 2. Actually, 
let us write the primary combinations for this case as 


000, 111, 100, 010, 110, 001, 101, O11. 


Using the cancellation procedure described above, we 
find that only two combinations remain, 000 and 111, 
which satisfied the condition presented. In other words, 
for the given arrangement of primary combinations 
z(3, 2) = 2 while it was shown above that for the arrange- 
ment established before x(3, 2) = 4. 

The geometric interpretation of this statement is given 
in Fig. 2. Here, two points are shown which correspond 


7h A 


(000) 


Fig. 2—Geometric interpretation of the relation 2(3, 2) = 2. 


to the two primary combinations 000 and 111 of sequence 
of eight primary combinations. The addition of any of 
the remaining six points to these two will, evidently, lead 
to the appearance of a pair of points different from each 
other not in two, but only in one coordinate. 

From Tables I and II it is easy to find the lower limit 
of x,,, in which we are interested, for small n and d values. 
Thus, for example, on the basis of the data of Table II 
it can be stated that 


Sono) 2 19) 


Laemmel, on the basis of the investigations of Hamming” 
and of other authors, gives relations which when used 
enables the limits between which the values of the z,,(n, d) 
function lies to be determined. These relations are the 
following: 


2" < 2,(n, 2A + 1) 
ea |.) 
< a (3) 
re alae aay) 
Ope 0, eee 2A 1) (4) 
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2 taltit 1)d) Sad) <a, e lee) (5) 

n(n, d + 1) < z,,(n, d) (6) 

Ln(t, di), ValM2; de) < enlan, + bre, ad, + bds) (7) 
aie) e— aoe Wain Absa Oe (8) 

2 ie AN 3) (9) 

a Mpg) ee OR NO eee (10) 

Grin, d) =A for d = 27, (11) 


Here A is the largest number of elements in which 
the distortion can be corrected automatically in each 
code combination; 


Ge) 


are the binomial coefficients 


etc., respectively; a and b are any positive integers; y 
is the least integer satisfying the inequality 


y > log, (n + 1). (12) 


The expression on the right side of the double inequality 
(3) was obtained by Hamming’ as the quotient of dividing 
the common number of 2” points lying on a sphere in 
n-dimensional space by the number of points contained 
in a small sphere representing one code combination. 

Setting n = 8 and A = 1 in (38), we obtain 


6 S108; oe 28: 


From Tables I and II above, it was found that z,,(8, 3) 
> 19. Consequently, it can be said that 


19k 2 (8, 3) es 


Using relations (3) to (11) and Tables I and II, the 
upper and lower limits between which the desired function 
x,(n, d) lies for small n and d values can be determined 
in an analogous way. 


NoIsk STABILITY OF A SYSTEM WITH ERROR-CORRECTING 
CopESs FoR A LARGE NUMBER OF ELEMENTS 


The formulas and tables cited in the preceding section 
do not permit a representation of the character of the 
x,(n, d) function to be formed in the region of large n 
values. Consequently, in this case they do not afford the 
possibility of calculating the capacity and the noise 
stability of a system with correcting codes. 

Let us formulate the problem of finding the inequality 
which can be of use in assessing the noise stability of a 
system with correcting codes for the case when the number 
n of elements is large. 

Let us denote the probability of the distortion, caused 
by noise, of any one element through p. In other words, 
p is the probability of zero transforming into one or one 
into zero in one element of the transmitted code combi- 
nation. 
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When a sequence of positive and negative currents 
(Fig. 5) are transmitted in the presence of noise, then p 


is the probability of the negative current transforming 


into a positive or a positive into a negative. 


+ 


Let us assume that the noise acts independently on 
each element of communication. In other words, we will 
assume that the probability of distortion of a given element 
by noise is not completely dependent on how all the 
remaining elements are subjected to distortion. 

Let us take a set of code combinations, each of which 
consists of n elements. Let us assume that any two 
combinations of this set differ from each other by not 
less than in d elements. 

When such a set is used to transmit information under 
noise conditions, each combination, generally, will be 
distorted by the noise. If any combination will be dis- 
torted to a small degree, then it will be transmitted without 
distortion if automatic error correction is used. If the 
degree of distortion of the transmitted combination is 
large, then the distorted combination can appear to be 
closer to the false than to the true. In this case, an error 
in the information transmission will occur even if auto- 
matic correction is used. Let us denote the probability 
of such an error by p, and let us pose the problem of 
finding the relation between this probability p, and the 
probability p of the distortion of a single element of 
information, the number 7 of all elements and the code 
distance d. 

Let us denote by u the number of distorted elements 
in one combination composed of n elements. If 

d 


B< 5 


9 (13) 


then, evidently, the distorted combination will be closer 
to the true than to any other and the information trans- 
mission will be without error. If 


(14) 


then the distorted combination can appear to be closer 
either to the true or to the false depending on the character 
of the distortion and the properties of the set of code 
combinations selected. 
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Actually, let us take the set of combinations shown | 
Table III, for example. 


TABLE III 
Copr CoMBINATIONS FOR n = 6 AND d = 3 


Index number 1 2 3 4 ii) 6 


Combination 100001; 001110) 111000) 010101) 011011} 11011 


Let us assume that combination no. 3 is transmitte 
and that uw = 2. Let us admit that the fourth and fift 
elements of this combination are subjected to distortior 
Then the distorted combination will be 111110. Com 
paring it with all six combinations shown in Table II] 
we obtain the respective differences in the numbers of th 
elements: 5, 2, 2, 4, 3, and 1. Hence, it is seen that th 
distorted combination no. 3 appears to be closest t 
combination no. 6. Consequently, the information trans 
mission in the given case will be with error. 

Now, let the first and second elements in the trans 
mitted combination no. 3 be subjected to distortion 
Then the distorted combination will be 001000. Comparin. 
it again with all six combinations of the table, we obtai 
the respective differences in the number of elements: € 
2, 2, 4, 3, and 5. Hence, it is seen that the distorte 
combination no. 3 appears to be closest to combination 
no. 2 and no. 38. Consequently, here the informatioi 
transmission will be either with error or without erro 
depending on whether the distorted combination no. 3 i 
transformed into combination no. 2 or no. 3. 

Let us take a code combination of n elements. I 
conformance with the Moivre-Laplace integral theorem, 
the number of elements distorted by noise satisfies th 
relation 


pe ‘a 2 aes Ce a} 
V np(l — p) 


1 i Ke | “| 
meas Sia exp | —a |d 15 
/ Qe a1 P 2 = ( ) 
where P is the probability of conserving the inequality i 
the braces. 


Here, putting a, = — ~, a, = aand taking into accoun 
that 


: ie exp | =| Oe al, aa ie ile | | d. 
ae x Sac: aie ae —= ex —— |Naz 
V 2r or 2 Ven a y 2 | 


we obtain 
P.. es 2 oh 
Vnp(l — p) 
il e : 
—-1- vera en | -5| dz. 
T a 


°B. V. Gnedenko, ‘Probability theory,’’ GITTL, 1950. 
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TABLE IV 
Copr Disrancr d ror VARIOUS VALUES OF pn AND 7p 
Ds Pp 
Ne 0.01 0.02 0.05 Om 0.2 0.3 0.4 0.5 
(ee 
ny = 0.1 10 1.006 1.534 2.766 4.432 7.242 9.714 11.97 14.05 
100 4.550 7.588 15.58 27.69 50.25 71.74 92.75 112.8 
1000 28 .06 51.35 Ls /aer 224.3 432.4 637.1 839.7 1041 
10000 255.5 435.9 1056 2077 4102 6117 8126 10130 
10 1.663 2.459 4.206 6.414 9.885 12.74 1520 17.36 
100 6.628 10.51 20.14 33.96 58.61 81.32 102.8 123.3 
» = 0.01 1000 34.63 60.59 132.1 244.1 458.8 667.4 872.0 1074 
10000 246.3 465.1 1101 2140 4186 6213 8228 10230 
10 2.144 3.136 5.258 7.863 11.82 14.95 eo 19.7 
100 8.148 12.65 23.47 38.54 64.72 88.32 110.3 130.9 
on = 0.001 1000 39.44 67.36 142.6 258.6 478.2 689.5 895.7 1098 
10000 261.5 486.5 1135 2185 4247 6283 8303 10310 
10 2.540 3.692 6.125 9.056 13.41 16.78 19.52 21.76 
100 9.400 14.41 26.21 42.31 COR 94.08 116.6 137.2 
mn = 0.0001 1000 43 . 40 72.92 Lot 2 270.6 494.1 ues 915.2 1118 
10000 274.0 504.1 1162 2223 4298 6341 8364 10370 


Let us select a in conformance with the equality 


d/2 — np 


** Vip = 9) < 


Then (16) can be represented thus: 


d 1 > z 
os {u < >] — val exp EA dz. (18) 


Since the probability of conserving one of the conditions 
wp < d/2 and p > d/2 is unity, then, in conformance with 
the theorem of the addition of probabilities 


d 1 a 2° 
po. . > 4 S we | exp |S eS) 


It was established above that the transmission of 
information will be without error for » < d/2 and can be 
both with and without error for u > d/2. In conformance 
with this statement and with (19), the probability, p,, 
of the distortion of a code group consisting of n elements, 
when automatic correction of the error is used, satisfies 
the inequality 


1m Sz ie exp || dz (20) 
or 
Pr < flp, n, d) (21) 
where 
1 P 2° 
fen) =f en|-G]ce a 


and a is expressed through (17). 


Carrying out calculations with (17) and (22) by making 
use of existing tables of the right side of (22) which are 
found in probability texts and putting the results on 
graphs, we obtain a family of graphs for which the data 
are given in Table IV. 

Using this table when the admissible probability p, of 
the distortion of a code group is given, the code distance 
d can be determined which would guarantee this prob- 
ability for the selected value of the number n of code group 
elements and for the known probability p of the distortion 
of one element. 

Thus, for example, being given p, < 0.01; p = 0-1, and 
n = 100, we obtain d % 34. Here, the auxiliary quantity 
a = 2.33. 

The (17), (21), and (22), which we found, and the 
Table IV computed therefrom permit the fundamental 
parameters of the correcting code to be calculated for 
sufficiently large nm values. A given noise stability of a 
coding system will be guaranteed with a certain margin 
when a code is used with parameters found by such a 
method. In order to determine the magnitude of this 
margin, further investigation must be made. 

In particular, the p, probability of distortion of a code 
group will be less than 0.01, in the example cited above, 
for a code distance d > 34. The minimum value of d 
guaranteeing a given p, probability of code group dis- 
tortion remains unknown here. It can only be stated that 
this minimum value is dmin < 34. 


On THE NUMBER OF CopE COMBINATIONS DIFFERING 
From Eacu OrTHer in Not Less THAN A GIVEN 
NumBer or ELemMEnNts (LARGE 7 CASE) 


One of the most important parameters of a system with 
a correcting code is the greatest number of code combi- 
nations for which any two differ from each other by not 
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less than a given number of elements. This parameter is 
expressed by the function 2,, (n, d) with values found in 
the first section of this work for small n and d. 

Let us pose the problem of finding the inequality which 
the x,,(n, d) function satisfies in the large n region. 

In order to solve this problem, let us select the code 
spacing 


d = 2n(p + 8) (23) 


where @ is any positive constant as small as desired. 
Substituting (23) into (17), we find 


Vn 
Vp(1 — p) 


Here, putting n — ©, we obtain a > o. Returning to 
(20) and taking into account that its right side approaches 
zero as a —> ©, we obtain p, — 0. This means that for a 
code spacing chosen according to (23), such a sufficiently 
large number n of elements in the code group can be 
found that the p, probability of its distortion can be made 
as small as desired for any value of the 8 parameter 
assigned as small as that which one would desire before- 
hand. 

The capacity C of the system expressed in binary units 
per unit time is by Shannon’s definition:' 


log. N(T) 


Co = thin T 


To0 


(24) 


where N(7’) is the common number of all different in- 
formation in time 7’. 


0.04 0,06 008 Ol iy] 


Fig. 4—Dependence of the highest capacity Co on the probability 
p of damage to one element. 


Selecting the length 7 of the communication element as 
the time unit (Fig. 4) and taking into account that in 
the system considered 


N(T):= x(n, d) (25) 
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we obtain 


log, x(n, d) 


Cas imn 


no n 


(26 


According to the above, the channel can be operated a: 
this capacity with the p, distortion probabilities as smal 
as desired, if n is large enough. 

Substituting (23) into (26), we obtain 


oe log. x(n, 2np + 2n8) 


no n 


(27 


On the other hand, it is known that for p < 0.5, the 
highest possible capacity Co of the system is expressed by 
Barnard: 


Co = 1+ p log, p 2G) = p) logs (1 ap cee 


which is shown graphically in Fig. 8.° 
According to the Shannon theorem,’ such a coding 
system does not exist which when used would guarantee 
the transmission of communications at rates exceeding 
C,) and with probable errors as close to zero as desired. 

Consequently 
CoG (29) 


or, taking into account (27), we obtain for large enough 7 


log, x,,(n, 2np + 2n@) Le (30 
0% 


n d 


from which, taking into account that this inequality is 
correct for 6 as small as desired, we obtain 


v (n 2np) < gittplogep+ (1—p) logs (1—p) In (31 
Here, putting 
2np = d 
we find 
x (n d) aa plltd/2hlog.d/2"+ (1—d2”) loga(1—d/2") In (32 
™m™ ? | 
or, after elementary transformations, we obtain 
Um, a) <2 (33 
where 
z= ga)n 


$(a) = 3a log, a + (2 — a) log, (2 — a)] 


7G. A. Barnard, “Simple Proofs of Simple Cases of the Codin 
rere paper read at third Symposium on Information Theory 
8 Translator’s note: No Fig. 8 appears in the paper. 
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Eq. (33) gives the solution of the problem which has 
ven formulated. They permit the determination, for 
ge n values, of the upper limit for the greatest amount, 
(n, d), of code combinations for which any two are 
erent from each other by not less than a given number 
elements. 
Shown in Fig. 5 is a graph of ¢(a) constructed according 
(33). Using this graph it is easy to determine the z/n 
tio for a given d/n. 
An analysis of (33) shows that the function 2’, the 
per limit of the desired z,,(n, d) function, approaches 
_ asymptotically for any constant d and n increasing 
ithout limit. If, conversely, n remains constant and d 
creases from 1 to n, then the 2° function decreases 
onotonically from 2” to 1. If n and d simultaneously 
crease such that their ratio a = d/n is constant, then 
const and the exponent in the 2° function will 
crease, approximately, directly proportionally to n. 
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Fig. 5—Graph of ¢(a). 


_ Two Inequalities Implied by Unique Decipherability’ 


BROCKWAY McMILLANT 


Summary—Consider a list of b words, each word being a string 

letters from a given fixed alphabet of a letters. If every string 
‘words drawn from this list, when written out in letters without 
iditional space marks to separate the words, is uniquely decipher- 
le, then 


GH eat he ig LY ef (1) 


here l;, 1 <7 < b, is the length of the ith word in the list. This 
sult extends a remark of J. L. Doob, who derived the same in- 
juality for lists of a more restricted kind. A consequence of (1) 
id work of Shannon is that this more restricted kind of list suffices 
the search for codes with specified amounts of redundancy. 


Discussion 


ET A be an alphabet of a letters. A finite nonvacuous 
sequence or string of letters of A will be called a 
word over A. The number of letters in a word will 

called its length, or length over A. Consider a finite 
mvacuous list B of words over A. This list B will be 
lled separable if, whenever a string of words of B is 
ritten out in letters, without space marks between the 
ords, the resulting string of letters is uniquely decipher- 
le into the original string of words. A condition necessary 
id sufficient for separability is given by Sardinas and 
utterson.* 


* Manuscript received by the PGIT, April 6, 1956. 

+ Bell Telephone Labs., New York, N. Y. 

1A. A. Sardinas and G. W. Patterson, ‘‘A necessary and suf- 
ient condition for unique decomposition of encoded messages,” 
53 IRE ConvEntION ReEcorD, pt. 8, pp. 104-108. 


The list of common English words, considered as 
words over the alphabet of 26 Latin letters, is not 
separable, as the sequences “together” and ‘‘to get her’ 
show. If each English word is considered as ending with 
a space mark, so that the words are over an alphabet of 
27 letters, then this list of words is separable. 

A strong sufficient condition for separability of B is 
that no word of B appears as the initial string of letters 
in a longer word of B. For convenience, call this property 
of B irreducibility. The list of common English words 
terminated by spaces is irreducible. The list (1, 10, 100) 
of words over the binary alphabet is separable but not 
irreducible. 

In an oral discussion,” Doob recently observed that the 
inequality (1) holds when B is an irreducible list. The 
main result of this note is a proof of (1) when B is merely 
separable. A strong converse result is also given: A con- 
struction used by Shannon shows that if l;, l2, +--+ l, are 
integers satisfying (1), then there exists an irreducible 
list B of b words whose lengths over A are respectively 
l,, lo, +++ ly. It follows as a corollary that if l,l, ---, by 
are the lengths over A of a separable list of words, they 
are also the lengths over A of an irreducible list. This 
rather algebraic conclusion results, as will be seen, from 
largely analytic arguments. 


2Comments given upon papers presented before a session on 
information theory at the summer meeting of the Inst. of Math- 
ematical Statistics, Ann Arbor, Mich., August 30, 1955. 
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As Doob observed, one interest of (1) is in the following 
application: Consider a universe of b events, respectively 
of probabilities p,, po, +++, DP». Suppose that a code is 
established in which each event is designated by a distinct 
word of Bb. If the results of a sequence of independent 
trials are recorded by writing the corresponding words 
of B in order without space marks, the expected number 
of letters written per trial is p,l, + pola -+ --- + Dols. 
If (1) holds; e.g., if B is separable, then one has a direct 
proof of the inequality 


b 


b 
di pili = — 2B p; log, pi. (2) 
This inequality, of course, also follows from the basic 
theorems of Shannon.* 

It follows from the corollary remark above that the 
set of values assumed by >>p,l; as B is varied over the 
class of separable lists coincides exactly with the set of 
values assumed when B is restricted to be irreducible. 
In particular, given a separable code, there exists an 
irreducible code for the same universe of events which 
has the same redundancy. 


PROOFS 


Given a separable list B, let 1 = max 1; (for 1 <7 < b), 
and let n, be the number of words of B which are of length 
r, 1 <r Sl. Then.(1) reads 


l 
DEW iene 
r=1 


What will be shown is that the polynomial Q(x) — 1, 
where 


(1’) 


O@)e= ss N,v 


has no zeros in the circle | x | < a‘ in the complex plane. 
In particular then, Q(x) — 1 has no zeros in the interval 
0 <2 <a‘. Since Q(x) is monotone and continuous for 
x => 0, and Q(0) = 0, (1’) follows. 

Let N(k) be the number of distinct sequences of words 
of B each of which, written as a string of letters, is of total 
length k over A. Since B is separable, each of these N(k) 
sequences of words gives a distinct sequence of k letters 
of A. Hence 0 < N(k) < a‘. A simple comparison test 
then shows that for any complex x such that | xa | < 1 the 
infinite series 1 + N(1)a + N(2)x? + - converges. 
This series, therefore, represents a function F(x) analytic 
inion <hr 

Suppose for a moment that k > /. Consider the N(k) 
strings of k letters of A mentioned above: those which are 
decipherable into sequences of words of B. They can be 


3C. E, Shannon, ‘‘A mathematical theory of communication,”’ 
Bell Sys. Tech. J., vol. 27, pp. 379-423, 623-656; July—October, 
1948. 
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partitioned into J subclasses C,, 1 <r < J, thus; Let 
consist of all strings whose first word is of length r. Th 
if r ~ s, C, and C, cannot overlap; if a string were 
both, it would be decipherable into two distinct sequen¢ 
of words. Since there are n, distinct words of length 
and N(k — r) possible distinct subsequent strings 
length k — r which are decipherable into sequences 
words of B, C, contains exactly n,N(k — r) distir 
strings of letters. Hence if k > 1 


N(k) = uN(k — 1) + mN(k — 2) + + 
Mb =e | 


It is easy to see that if one defines N(O) = 1, a 
N(—k) = 0 for k > 0, then (8) in fact holds for all k > 

Multiply (3) by x” and sum from k = 1 tok = o., 
| za | < 1, one gets F(x) — 1 = Q(x)F (a), or 


i 
1 — Q(z) 
Since /’(x) is analytic inside*| «| <a", Q(v) — 1 cann 
vanish in that circle. Hence (1’). 

Conversely, suppose that J, l,, ---, J, are intege 
satisfying (1). So enumerate them that l, </, <--- < 
It is easy to see that there is then an integer k > 0 su 
that 


Fa@= 


a*ta?+---+k+ a” =1. 


Letg,; =a ',1<i<b—1l,andq¢q, =a” ford <% 
b + k. Then qi, qs, +++, doi Qualify as an exhaustive li 
of probabilities. The construction of Shannon® for ; 
efficient binary code to transmit messages drawn fro 
a universe with the probabilities g; can easily be extend: 
to an alphabet A of a letters. This extension then d 
scribes the construction of an irreducible list of b + 
words, the first 6 of which have lengths 1,, /,, ---, 
Deleting the last k words leaves the list irreducible, az 
with words of the desired length. 

Finally to prove (2) from (1), it is necessary only 
invoke the inequality 


log, 2us(ei= I) losye, ( 


which is well known, and easily proved by elementa 
calculus. Then if p,, ---, p, are any nonnegative numbe 
which sum to 1, 


1 
Sie: log. 5 —~ Lpk = Xp, log, 


Px 
= > r(“ 


~ 1) lo nO = 
P : ; 


—Ik 


Te 
k 


Since equality in (4) occurs only when x = 
(2) can occur only if each p, = a” 
lL, = — log. py: 


1, equality 
, 2.¢., if and only 


CZ) 


Summary—This note discusses the problem of maximizing the 
ate of flow from one terminal to another, through a network which 
onsists of a number of branches, each of which has a limited capa- 
ity. The main result is a theorem: The maximum possible flow from 
2ft to right through a network is equal to the minimum value among 
ll simple cut-sets. This theorem is applied to solve a more general 
roblem, in which a number of input nodes and a number of output 
odes are used. 


of Fig. 1. The branches of the network might 
represent communication channels, or, more 
renerally, any conveying system of limited capacity as, 
or example, a railroad system, a power feeding system, 
wr a network of pipes, provided in each case it is possible 
o assign a definite maximum allowed rate of flow over a 
riven branch. The links may be of two types, either one 
lirectional (indicated by arrows) or two directional, in 
vhich case flow is allowed in either direction at anything 
ip to maximum capacity. At the nodes or junction points 
f the network, any redistribution of incoming flow into 
he outgoing flow is allowed, subject only to the re- 
triction of not exceeding in any branch the capacity, and 
~ obeying the Kirchhoff law that the total (algebraic) 
low into a node be zero. Note that in the case of infor- 
nation flow, this may require arbitrarily large delays at 
ach node to permit recoding of the output signals from 
hat node. The problem is to evaluate the maximum 
ossible flow through the network as a whole, entering at 
he left terminal and emerging at the right terminal. 


Sarat a two-terminal network such as that 


Fig. 1 


The answer can be given in terms of cut-sets of the 
letwork. A cut-set of a two-terminal network is a set of 
ranches such that when deleted from the network, the 
etwork falls into two or more unconnected parts with 
he two terminals in different parts. Thus, every path 


* Manuscript received by the PGIT, July 11, 1956. 

7 Elec. Eng. Dept. and Res. Lab. of Electronics, Mass. Inst. 
‘ech., Cambridge, Mass. 

t Lincoln Lab., M.I.T., Lexington, Mass. 

§ Bell Telephone Labs., Murray Hill, N. J., and M.I.T., Cam- 
ridge, Mass. 
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A Note on the Maximum Flow Through a Network’ 


P. ELIAS}, A. FEINSTEIN}, AND C. E. SHANNONS 


from one terminal to the other in the original network 
passes through at least one branch in the cut-set. In the 
network above, some examples of cut-sets are (d, e, f), 
and (b, c, e, g, h), (d, g, h, 7). By a simple cut-set we will 
mean a cut-set such that if any branch is omitted it is no 
longer a cut-set. Thus (d, e, f) and (0, ¢, e, g, h) are simple 
cut-sets while (d, g, h, 7) is not. When a simple cut-set is 
deleted from a connected two-terminal network, the net- 
work falls into exactly two parts, a left part containing the 
left terminal and a rzght part containing the right terminal. 
We assign a value to a simple cut-set by taking the sum of 
capacities of branches in the cut-set, only counting 
capacities, however, from the left part to the right part 
for branches that are unidirectional. Note that the 
direction of an unidirectional branch cannot be deduced 
from its appearance in the graph of the network. A branch 
is directed from left to right in a minimal cut-set if, and 
only if, the arrow on the branch points from a node in the 
left part of the network to a node in the right part. Thus, 
in the example, the cut-set (d, e, f) has the value 5 + 1 = 6, 
the cut-set (b, c, e, g, h) has value 3 + 2+ 3+ 2 = 10. 

Theorem: The maximum possible flow from left to right 
through a network is equal to the minimum value among 
all simple cut-sets. 

This theorem may appear almost obvious on physical 
grounds and appears to have been accepted without proof 
for some time by workers in communication theory. 
However, while the fact that this flow cannot be exceeded 
is indeed almost trivial, the fact that it can actually be 
achieved is by no means obvious. We understand that 
proofs of the theorem have been given by Ford and 
Fulkerson’ and Fulkerson and Dantzig.” The following 
proof is relatively simple, and we believe different in 
principle. 

To prove first that the minimum cut-set flow cannot be 
exceeded, consider any given flow pattern and a minimum- 
valued cut-set C. Take the algebraic sum S of flows from 
left to right across this cut-set. This is clearly less than or 
equal to the value V of the cut-set, since the latter would 
result if all paths from left to right in C were carrying 
full capacity, and those in the reverse direction were 
carrying zero. Now add to S the sum of the algebraic 
flows into all nodes in the right-hand group for the cut- 
set C. This sum is zero because of the Kirchhoff law 
constraint at each node. Viewed another way, however, 
we see that it cancels out each flow contributing to S, 
and also that each flow on a branch with both ends in the 


1],. Ford, Jr. and D. R. Fulkerson, Can. J. Math.; to be published. 

2G. B. Dantzig and D. R. Fulkerson, “On the Max-Flow Min- 
Cut Theorem of Networks,” in ‘Linear Inequalities,’ Ann. Math. 
Studies, no. 38, Princeton, New Jersey, 1956. 
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right hand group appears with both plus and minus signs 
and therefore cancels out. The only term left, therefore, 
which is not cancelled is the flow out of the right hand 
terminal, that is to say, the total flow F through the 
network. We conclude, then that / < V. 

We now prove the more interesting positive assertion 
of the theorem: That a flow pattern can be found which 
actually achieves the rate V. From any given network 
with minimum cut-set value V it is possible to construct 
what we will call a reduced network with the properties 
listed below. 


1) The graph of the reduced network is the same as 
that of the original network except possibly that 
some of the branches of the original network are 
missing (zero capacity) in the reduced network. 

2) Every branch in the reduced network has a capacity 
equal to or less than the corresponding branch of the 
original network. 

3) Every branch of the reduced network is in at least 
one cut-set of value V, and V is the minimum value 
cut-set for the reduced network. 


A reduced network may be constructed as follows. If 
there is any branch which is not in some minimum cut-set, 
reduce its capacity until either it is in a minimum cut-set 
or the value reaches zero. Next, take any other branch not 
in a minimum cut-set and perform the same operation. 
Continue in this way until no branches remain which are 
not in minimum cut-sets. The network then clearly 
satisfies the condition. In general, there will be many 
different reduced networks obtainable from a given net- 
work depending on the order in which the branches are 
chosen. If a satisfactory flow pattern can be found for a 
reduced network, it is clear that the same flow pattern 
will be satisfactory in the original network, since both the 
Kirchhoff condition and the capacity limitation will be 
satisfied. Hence, if we prove the theorm for reduced 
networks, it will be true in general. 

The proof will proceed by an induction on the number 
of branches. First note that if every path through a re- 
duced network contains only two or less elements, the 
network is of the form shown typically in Fig. 2. In 


Fig. 2 


general, such a network consists of a paralleling of series 
subnetworks, these series combinations being at most 
two long with or without arrows from left to right. It is 
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obvious that for such a reduced network, the theorem 
true. It is only necessary to load up each branch 
capacity. Now suppose the theorem true for all redue 
networks with less than n nodes. We will then show th 
it is true for any reduced network with n nodes. 

Either the given reduced network with n nodes has 
path from left to right of length at least three, or it is 
the type just described. In the latter case the theorem 
true, as mentioned. In the former case, taking the secon 
branch on a path of length three, we have an eleme 
running between internal nodes. There exists (since tl 
network is reduced) a minimum cut-set containing tk 
branch. Replace each branch in the cut-set by ty 
branches in series, each with the same capacity as tl 
original branch. Now identify (or join together) all 
these newly-formed middle nodes as one single nod 
The network then becomes a series connection of ty 
simpler networks. Each of these has the same minimu 
cut-set value V since they each contain a cut-set corr 
sponding to C, and furthermore neither can conta 
higher-valued cut-sets since the operation of identifyi 
nodes only eliminates and cannot introduce new cut-set 

Each of the two networks in series contains a numb 
of branches smaller than n. This is evident because of tl 
path of length at least three from the left terminal to tl 
right terminal. This path implies the existence of a branc 
in the left group which does not appear in the right grot 
and conversely. Thus by inductive assumption, a sati 
factory flow pattern with total flow V can be set up - 
each of these networks. It is clear, then, that when tl 
common connecting node is separated into its origin 
form, the same flow pattern is satisfactory for the origin 
network. This concludes the proof. 

It is interesting that in a reduced network each bran 
is loaded to its full capacity and the direction of flow 
determined by any minimum cut-set through a brane 
In nonreduced networks there is, in general, some freedo 
in the amount of flow in branches and even, sometime 
in the direction of flow. 


| 
Fig. 3 : 


A more general problem concerning flow through ' 
network can be readily reduced to the above resul 
Suppose we have a network with a number of input nod 
and a number of output nodes as in Fig. 3. The three nod 


956 


n the left are inputs and it is desired to introduce two, 
hree, and six units of flow at these points. The nodes on 
he right are outputs and it is desired to deliver three and 
ight units at these points. The problem is to find con- 
‘itions under which this is possible. 

This problem may be reduced to the earlier one by 
dding a channel for each input to a common left-hand 
iode, the capacity of the channel being equal to the input 
low, and also introducing channels from the outputs to a 
ommon right-hand node with capacities equal to the out- 
yut flow. In the particular case this leads to Fig. 4. The 


3 


Fig. 4 


1etwork obtained in this way from the original problem 
vill be called the augmented network. 

It is easy to show that necessary and sufficient con- 
litions for solving this multiple input multiple output 
roblem are the following: 


Campbell: Rectification of Two Signals in Random Noise 
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1) The sum of the input flows must equal the sum of 
the output flows. Let this sum be C. 

2) The minimum cut-set in the augmented network 
must have a value C. 


To prove these, note that the necessity of 1 is obvious and 
that of 2 follows by assuming a flow pattern in the original 
network satisfying the conditions. This can be translated 
into a flow pattern in the augmented network, and using 
the theorem, this implies no cut-set with value less than C. 
Since there are cut-sets with value C (those through the 
added branches), the minimum cut-set value is equal to C. 

The sufficiency of the conditions follows from noting 
that 2 implies, using the theorem, that a flow pattern can 
be set up in the augmented network with C in at the left 
and out at the right. Now by Kirchhoff’s law at the right 
and left terminals and using condition 1, each added 
output branch and input branch is carrying a flow equal 
to that desired. Hence, this flow pattern in the original 
network solves the problem. 
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Rectification of Two Signals in Random Noise’ 


L. LORNE CAMPBELLT 


Summary—The spectrum of the output of a half-wave rectifier 
s derived for an input which is the sum of random noise and two 
inusoidal signals of different frequencies. The method used is 
he characteristic function method described by Rice. The com- 
nents of the output spectrum are given as infinite series of 
lypergeometric functions. If both the input signals are small 
‘ompared with the noise, it is shown that the ratio of the output 
signal power at the difference frequency to the output noise power 
S proportional to the product of the input signal-to-noise power 
atios at the two frequencies. If one of the input signals is very 
arge compared with the noise, it is shown that the other signal 
ind the noise are translated in frequency without alteration of 
he signal-to-noise ratio. A correction factor is obtained for the 
ase where the large signal is not quite large enough. Finally, the 
utput signal-to-noise ratio of a single-sideband detector is calcu- 
ated as a function of the input signal-to-noise ratio, when the 
sideband amplitude is one-half the carrier amplitude. 


* Manuscript received by the PGIT, May 3, 1956. Work per- 
ormed under project PCC no. D48-95-12-21. 
+ Radio Physics Lab., Defence Research Board, Ottawa, Canada. 


List or PrINcIPAL SYMBOLS 


P, Q—amplitudes of input signals of frequencies p/2z, 
q/ 2. 

V (t)—total input voltage. 

V.()=P cos pt + Q cos gt—total input signal. 

Vy(t)—input noise voltage. 

a, v—rectifier parameters, defined by J = aV" for V > 0, 
I = Ofor V < Owhere V is the input and / the output. 

w(f)—power spectrum of input noise. 

Yo—input noise power. 

v, = f% w(f) cos 2rfr df—autocorrelation function of 
input noise. 

W(r)—autocorrelation function of the rectifier output. 

WV..(r)—autocorrelation function of the portion of the 
rectifier output with a discrete spectrum. 
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WV.(7)— autocorrelation function of the portion of the 
rectifier output with a continuous spectrum. 

W(/)— power spectrum of rectifier output. 

W.(/)—eontinuous portion of output spectrum. 

W.».(/)—low frequency output noise spectrum. 

g(u, v, 7)—characteristic function of the probability 
density of V(t), V(t + 7). 

g.(u, v, r)—characteristic function of V,(#), V,(¢ + 7). 

v= P*/2o, y = Q’/2v>—input signal-to-noise power ratios 
for the two input signals. 

k, m, n, s—integers. 

C— contour along real axis with downward indentation at 
the origin, 

G.(f) = [3 @,)* cos 2arfr dr. 

Tnnk amplitudes of various spectral components in output. 

eo tee —e ior n> OL 

J,,(x)— Bessel function of order m. 

T'(7)— Gamma function. 

F(a; c; x)—confluent hypergeometric function. 

oI’, (a, b; c; )—hypergeometric series. 

fa = | p — q |/2x—the difference frequency of the two 
signals. ; 

7,— output signal at frequency f,. 

8—input bandwidth. 

Af—output bandwidth. 

poutput signal-to-noise power ratio in the band Af. 

Wy—Input noise power per unit bandwidth when input 
noise spectrum is rectangular. 

A\—modulation index of single-sideband signal, Section 
VI only. 

w/2nr—carrier frequency of single-sideband signal. 

Aw/27—modulation frequency of single-sideband signal. 


I. InrRoDUCTION 


HE SPECTRUM of the output of a »v-th law 

half-wave rectifier when the input is a sinusoidal 

signal combined with Gaussian noise has been deter- 
mined by Rice’ and Middleton.”'* Middleton also gives the 
output spectrum when the input is an amplitude-modu- 
lated signal combined with noise. Rice gives the output 
from a full-wave square law rectifier when the input is 
noise plus a sinusoidal signal, the sum of two sinusoidal 
signals, or an amplitude-modulated signal. 

In this paper we derive the spectrum of the output 
from a v-th law half-wave rectifier when the input is the 
sum of two sinusoidal signals of different frequencies and 
Gaussian noise. Thus, let the input voltage be 


V(t) = P cos pt + Q cos qt + Vy(d), (1) 


18. O. Rice, “Mathematical analysis of random noise,” Bell 
Sys. Tech. J., vol. 24, pp. 46-156; January, 1945. 

2D. Middleton, ‘Rectification of a sinusoidally modulated 
carrier in the presence of noise,’”’ Proc. IRE, vol. 36, pp. 1467-1477; 
December, 1948. 

3], Middleton, “Some general results in the theory of noise 
through non-linear devices,” Quart. Appl. Math., vol. 5, pp. 445- 
498; January, 1948. 
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where Vy is a noise voltage. We derive the power spectrur 
of 7(¢t) where 


VO) 20 
Vigee0: 


for 


= al VOT 
= 0 


I(t) 


for 


General expressions will be obtained for arbitrary v an 
some special cases will be treated in more detail for th 
linear half-wave rectifier (v = 1). 

An input voltage of the form (1) may be found in sue 
devices as single-sideband detectors and mixers, and 1 
some forms of phase detector. The noiseless case has bee 
treated theoretically by Bennett* for half-wave linear an 
square law rectifiers. The resulting modulation product 
for these cases have been tabulated by Sternberg, Shipmar 
Thurston, and Kaufman.’’® 

The power spectrum of the output of the rectifier - 
derived in Section II and in Section III these results a1 
put into a form which is more useful for calculation. I 
Sections IV, V, and VI the special cases of small signal: 
large signals, and single-sideband detection are considere¢ 


II. DERIVATION OF THE OvuTPUT SPECTRUM 


We use the characteristic function method describe 
by Rice.’ He has already derived the de output and th 
total low-frequency output by a slightly different method 
for the case of a linear rectifier. 

The autocorrelation function, V(r), of the output ¢ 
the rectifier is given’ by 
a Ty + idu fTv~t+ 1) 


5 (us v; 7) dv, Tie 
Aor é (iu)’*? C (iv) v+1 g ) 


V(r) = 
where 


gu, v, 7), = lim ll exp [uV(t) + wV(t + 7)] dt, ¢ 


T0 


and the contour C is the real axis from — © to © with 
downward indentation at, the origin. The output powe 
spectrum, W(f), of Y(r) is given by the transform 


Wf) = 4 [ “Cy case ( 


If Vy represents Gaussian noise, the characterist 
function, g(u, v, 7), is given by 


glu, v, T) == gs(u, v, T) exp | -# (u” tg v) =A vow | (( 


*W. R. Bennett, ‘““New results in the calculation of modulatic 
products,” Bell Sys. Tech. J., vol. 12, pp. 228-243; April, 1933. 

°>R. L. Sternberg, J. S. Shipman, and W. B. Thurston, ‘“Tabl 
of Bennett functions for the two-frequency modulation produ 
problem for the half-wave linear rectifier,’ Quart. J. Mech. a 
Appl. Math., vol. 7, pp. 505-511; December, 1954. 

5 R. L. Sternberg, J. S. Shipman, and H. Kaufman, ‘‘Tables 
Bennett functions for the two-frequency modulation produ 
problem for the half-wave square-law rectifier,’ Quart. J. Mec 
and Appl. Math., vol. 8, pp. 457-467; December, 1955. 

7S. O. Rice, op. cit., sections 4.8 to 4.10. 

8 Ibid, section 4.2. 
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where y, is the autocorrelation function of the noise, vo 
is the mean square value of the noise 


BU, v, 7) = lim zt exp [iwV.(t) + wV.(t + 7)] dt (7) 


TO 0 
and in the case of the input (1) 


V.() = P cos pt + Q cos qt. (8) 


Now, it is easily shown from the generating function 
for Bessel functions that 
exp (iuP cos pt) = >> i"e,J,(Pu) cos npt, (9) 
n=0 
wnere ¢_,5 = 1 and «, = 2 for n > OQ. Thus, in order to 
evaluate the integral of (7), it will be seen that we must 
consider integrals of the form 


TT 
| oe if cos Nypt cos noqt 
0 


-cos nyp(t + 7) cos nag(t + 7) dt (10) 


It will be assumed that p and q are incommensurable. 
That is, it is assumed that mp + nq # 0 for ane integers 
m and n. Then when we form 


aera | 
lim T Pes eee 


Tow 


the only terms which contribute are those for which 


ft, = nz; and ne = M, since all other integrals contain 
only terms like cos [(n3 + 11) pt + n3p7] and cos [(m4 + 2) 
qt + naqr]. 


Now 
- 


[ [cos (Qnipt + npr) + cos npr] 
0 


| en 7 


-[cos (2Qn2qgt + noqr) + cos neg] dt, (11) 
and hence 
a, 1 
lim T Yn ees COS NPT COS NsqT. (12) 
T0 En En» 


Therefore we have 


u,v, 7) = e 2 (Ae ee IME U) Ske v) dl Qt AQu) 


“COs mpT COS NqT. (13) 


In order to evaluate (3) we expand e *"” 


series. We have 


a" g.u,¥, 7) = ae y(- 


m=0 n= 


Nee eres Ce Qe) 


COS MpT COS NT. (14) 


Jnl P0) TAQ) ee 


Therefore, from (3), (6), and (14), 
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(os) (ce) fo) k 
V(r) = a 2 ye me En€, COS MPT COS NgT ae ty sie Cs) 
where 
Tinnk = ae Tet 1) 


| eo ial Pu (Que ue (16) 
Cc 

The output spectrum is given by (5). It is convenient 
to separate the discrete spectrum [de component and 
components of frequency (mp + ngq)/2z] from the con- 
tinuous spectrum. The discrete spectrum is obtained from 
the terms in (15) which do not vanish as r > o. These 
are the terms for which k = 0, since ¥, ~ 0 as r— o. 
Thus we shall write 


UO) ea) es (17) 


where 


aw D> DY €n€xno COS MPT COS NGT, (18) 


m=0 n=0 


Walz) = 


and W,(7) is the portion of V(7) which contributes to the 
continuous spectrum. 
Now, 


ar py ELeT onl cos (mp _ NQ)T 


m=0 n=0 


w|&, 


+ cos (mp — nq)r]. (19) 


Thus the amplitude of the de component of the output, J, 
1S 
(20) 


Ipn.c. = @ | Too | 


and the amplitude of the components of frequencies 


(mp + nq)/2x and | mp — ng |/2m are 
I Meee = 2a Weer hs (21) 


The continuous spectrum, W,(f), is obtained from 
WG)=4 | Tac tees aaa (5’) 


Therefore, 


. Dan 


‘| @ 


where 


= i (p,)" cos 2afr dr. (23) 
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The functions @,(f) are discussed and tabulated by Rice?® 
for three types of predetector filter. 


III. EvaALuaTION OF THE CONSTANTS fnng 


It is necessary to obtain a more useful expression for 
Tmnk {rom (16). Two infinite series expressions for ?mnz Will 
be derived here. 

The first expression is obtained by expanding J,(Qu) 
in a Maclaurin series. We obtain 


ee ioe hen 2 
We Qn seen 


s=0 


| Bae J (Pui eee du. (24) 
Cc 


The integrals in (24) may be evaluated by means of a 
formula given by Rice.’ The result is 


Pps Dy eke eee 
2m! (2) ue 


Tmnk = 


1,0 /2ik + m+n +. 28 = vpn te 1; 2) (25) 
=o «6 Si(s + n)!TA/2[2 — k — m — n — 2s + J) : 
where 
eee Balik 
oe 20 28) 


The quantities « and y are signal-to-noise power ratios 
at the input for the two signals P cos pt and Q cos gt. The 
function ,F, (a; c; z) isa confluent hypergeometric function, 
defined by 


a(a + 1) 2 
c(c + 1) 2! 


(Gs ye) We +--+, (27) 
It may be noted that, if k + m + n — rv is an even 
integer, the series in (25) terminates because of. the 
Gamma function in the denominator. 
An alternative series for 7,,,, may be obtained by 
expanding the product J,,(Pu) J,(Qu) as follows.”° 


(Pu/2)"(Qu/2)” 


n! 
SS CeV'Cu/2" ahi(=s, =m = syn + 1: IPD, 
'(m ! 
s!(m + s)! (28) 
where ;F',(a, b; c; z) is the hypergeometric series, given by 
abz 


Oe re a(a + 1)b(b + 1) 2 
ali(a, by e;2) = 1 + C + eho ml 


In(Pu)T.(Qu) = 


s=0 


(29) 


The series (29) terminates if a or b is a negative integer. 
If we insert the series (28) in (16) and use the result’ that 


-Nir 


ime 


i Ce a ee du = 5 
c a T(1 — d) 


(30) 


9 [bid., appendix. 
10 G, N. Watson, ‘‘Theory of Bessel Functions,’’ The Macmillan 
Co., New York, N. Y., p. 148; 1948. 
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we obtain 


= CRS ali) aam/2, n/2 (2)°" 
Vinnk rg Qn! v Y Wo 

oS of (8) — i — Sh el yee 

Ad slim + 8) !T(1/2[2 — k — m — n — 2s + »]J) 


-& 
Each 7, function in (31) is a finite polynomial in y/z. 


IV. Smaui Input Sianat-To-NoisE Ratio 


In this section and the succeeding sections we sha 
assume that the desired output is the signal with frequenc 
fa = | p — q |/2m and that post-detection filters remov 
all the other signals and all the noise except that in a ban 
around the desired signal. Then the output signal, J,, 
given by 


(3: 


We shall also assume that the noise input, Vy, has a re¢ 
tangular spectrum given by 


Ih = PeeIP> soe 


fr-5<lft<ft+s @ 


w(f) = 0 


where the frequencies p/2zm and q/2z are between fy — 6B/ 
and fy + 6/2. Then, as shown by Rice,” 


elsewhere, 


pr _ vo es B , 
Gi) = 2 foo aif 
Gif) = 0 elsewhere, 
and the low-frequency portion of G,(f) is given by 
G() = 0-1/8) O<f<B 
G.(f) = 0 Wiad ees (32 


We shall consider a relatively narrow-band system; tha 
is, we assume 8 XK p,| pp —q| <p. 

In this section we consider the case where the inpu 
signal-to-noise ratios « and y are both small compare 
with unity. The output signal-to-noise ratio in a band 
width Af about the signal will be obtained here. 

From (25) it is seen that, for < 1 and y < 1, th 
signal amplitude is given approximately by 
aly + 1) (% we 1/2 

T0/D) (3) oy 
The principal contribution to the low-frequency nois 
comes from the beats of noise with itself, 7.e., from th 
term in (22) involving roo2. The noise power in the ban 
Af is given approximately by 


I, = 20fi19 = 


N=alrin f(A) +G(-Dlaf, Gi 


a—Af/2 


[ea Tey 40-9) 


or 
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The output signal-to-noise power ratio, p, is given by 


ee Co, 
peeeoNer Ataf ie) 


(39) 


or fa < 8. This result is analogous to the result obtained 
yy Middleton’ for the rectification of amplitude modu- 
ated waves, v2z., that the output signal-to-noise ratio is 
oportional to the square of the input signal-to-noise 
atio for small values of the input signal-to-noise ratio. 
[his result is independent of the exponent v in the detector 
aw. 


V. One Larce Input SignaL—LInEAR RECTIFIER 


In this section we consider the output spectrum of a 
inear (v = 1) rectifier when x > 1 and x > y. This is the 
‘ondition which normally holds if the signal P cos pt is 
upplied by a local oscillator. For this purpose the 
isymptotic relation” 


Be ro _| aa = 6 + 1) 
a eae 


eee, | 


p 2! 
(p > 0) 


me 0. Pp) = p 


os ala + 1)\(a — 


(40) 


s useful. 

From (25) and (40) it is seen that as x > o, the terms 
# highest order in the spectrum are those proportional to 
00 (m = 0, 1, 2, ---). These terms are just the de and 
he harmonics of P cos pt. As a check on the results, it is 
asily shown that the amplitudes of the de and harmonics 
ree with those calculated from the Fourier series of the 
yutput with Q and Vy equal to zero. 

Finally, we shall consider the low-frequency spectrum 
when «x is large. Then 


iseky = & E a é ae O(x’) , (41) 


io that 


jhe — Qatano SAO Q E = Ba a ow) |. (42) 
™ 8x 

[The principal contribution to the low-frequency noise 

spectrum (39) is from rio:. For x > 1, 


il il -2 
Tio1 ae E wha? + O@ )|. (43) 


(ll other contributions are of order x” or smaller. Now, 
f w(f) is the input spectrum, it is seen easily from (5) and 
23) that 


wf), 


Gif) =p (44) 


hus, the low-frequency noise spectrum is, approximately, 


Wirt =arn [ols +2)+u(e-2)] oo 


Campbell: Rectification of Two Signals in Random Noise 


123 


It will be seen that the noise spectrum has been split into 
two parts, which have been shifted up and down, re- 
spectively, in frequency by an amount p/27. The noise 
power spectrum has, from (43), been multiplied also by 
a factor a°/r’ for large x. At the same time the signal 
amplitude, /,, has been multiplied by a factor a/z so 
that the signal power is multiplied by a7/z’. 

Thus, when z is large enough, the effect on Q cos gt + Vy 
is just to shift the frequency spectrum of this voltage 
without altering the shape of the spectrum or the relation. 
between signal and noise. 

When P is not quite large enough to shift Q cos gf + Vy 
in frequency without distortion, a more nearly exact 
expression can be obtained by including more terms in 
the asymptotic expansions. For example, consider the 
case where the input spectrum is rectangular with a width 
which is large compared with | p — q|/2z. For convenience 
we will assume that the input noise has a spectrum given 
by (83), with the center frequency fy equal to p/2x and 
with 8 >| p — gq |/2r. In this case, the low-frequency 
noise in the output in a band of width Af about the signal 
frequency is given approximately by 


N == aj] si.0(2) + 8.02) se Zriat(0) | (46) 


or 


a Yo ANA 
26 


If the asymptotic expansion (40) is used to calculate the 
coefficients 7101, To11, ANd foo. an approximate expression 
can be obtained for the output noise. Then the output 
signal-to-noise ratio, p, is given by 


= AL opps ( ) 
aH ae oVW pe oe iy)" 


N = 


(aren =m arena ae Vorzoal (47) 


(48) 


According to (48) the output signal-to-noise ratio, p, 
is just equal to the input signal-to-noise ratio, y, multi- 
plied by the bandwidth factor 6/2Af, provided that P, 
and hence 2, is large enough. If P is not quite large enough 
to transfer the signal, Q cos gi, in frequency without 
altering the signal-to-noise ratio, the first-order correc- 
tion factor is 1 — 1/82. 


VI. Stneun-SIpDEBAND RECEPTION—LINEAR RECTIFIER 


In this section we consider the detection of the signal 


V = E,[coswt + A cos w + Aw)t] + Vy (49) 


in a linear rectifier. Bridges” has treated a similar problem 
by a different method. It will be assumed that Aw < w. 
Such a signal could arise in single-sideband transmission 
with unsuppressed or partially suppressed carrier. Hy cos 
wt will be referred to here as the carrier and x will be the 
carrier to noise ratio, given by 


uJ, E. Bridges, “Detection of television signals in thermal 
noise,” Proc. IRE, vol. 42, pp. 1896-1405; September, 1954. 


124 
Eo ~ 
2S as 50 
a, (50) 
Then, in the notation used earlier, 
Eon : 2 
y= == NGS dl 
Pero (51) 


Only the terms in the output low frequency noise 
spectrum which involve the coefficients 702, 7101, ANd Toi 
will be considered here. These terms represent, respective- 
ly, the contributions due to beats of noise with noise, 
noise with carrier, and noise with sideband. The approxi- 
mation of the series in (22) by three terms is equivalent 
to the approximation made by Middleton” in his calcu- 
lation for double-sideband amplitude modulation. It will 
be assumed that the width of the input noise spectrum is 
large compared with the modulation frequency Aw/2r 
and that the signal is near the center of the input spectrum. 
Then the output noise power in a band Af about the signal 
is given approximately by (47) with y given by (51). 
Hence the output signal-to-noise ratio is given by 


ae 4Brii0. . 
PAF Worcoe =i Arion ate 4ro11] 


(52) 


The coefficients 7,,,, were determined as functions of x 
for \ = 1/2 with the aid of (25). The confluent hyper- 
geometric functions were obtained from tables prepared 
by Middleton and Johnson.” The output signal-to-noise 
ratio, p, is plotted in Fig. 1 as a function of carrier-to-noise 
ratio, x, for \ = 1/2. The units of p are 8/2Af so that the 
signal-to-noise ratio shown must be multiplied by the 
bandwidth factor B/2Af. 

The form of single-sideband reception in which the 
carrier is supplied by a local oscillator follows easily from 
the results of Section V. 


VII. Discussion or RESULTS 


The formulas of Sections II and III may provide useful 
methods of computing the output of a half-wave rectifier 
when the input consists of the sum of two sinusoidal 
signals and random noise. The usefulness of these formulas 
depends, in part, on the availability of tables of the 
confluent hypergeometric functions. Almost all the 
functions which occur in (25) are of the form ,F, (M/2; 
N; — x), where M is an odd integer and N is an integer. 


27), Middleton and VY. Johnson, “A Tabulation of Selected 
Confluent Hypergeometric Functions,’ Cruft Lab. Tech. Rep. no. 
140; January, 1952. 
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Fig. 1—Output signal-to-noise ratio as a function of input carrier 
to-noise ratio for single-sideband detection with X = 1/2. 


The terms for which M is an even integer drop out, ir 
almost every case, because of a Gamma function with ¢ 
nonpositive integer argument in the denominator. As 
mentioned before, Middleton and Johnson’” have tabu. 
lated many confluent hypergeometric functions of the 
required form. It may also be mentioned that ,F, (M/2 
N; — x) can be expressed in terms of the modified Besse. 
functions J, («/2) and J, («/2). Tables of these Bessel 
functions are generally available. 

The result of Section IV agrees with the result obtained 
for other signals. The result is that a half-wave detector 
seriously degrades the signal-to-noise ratio if the input 
signal-to-noise ratio is too low. This is the familia 
phemonenon of modulation suppression. 

Section V yields the well-known result that a signal and 
noise can be translated in frequency without modifying 
the relationship between the signal and noise provided 
that the oscillator signal is large enough. If the signal 
provided by the local oscillator is not quite large enough 
to do this, (48) may be used to obtain a first-order ap- 
proximation to the signal-to-noise ratio at the new 
frequency. Better approximations could be obtained by 
the use of more terms in the asymptotic expansions. | 

Section VI provides one example of the type of calcu- 
lation which must be performed when it is not possible 
to use the approximations of Sections IV or V. 
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Optmum Detection of Random Signals in Noise, with 
Application to Scatter-Multipath Communication, I’ 


ROBERT PRICEt 


Summary—Solutions are obtained in open form for the optimum, 
srobability-computing detector of either Gaussian signals, or 
snown signals transmitted via scatter-paths, where the signals 
nave been further perturbed by additive white Gaussian noise. 
The optimum receiver operates on the received waveforms with 
ilter-functions and biasing constants determined by pairs of 
nhomogeneous and homogeneous integral equations, respectively. 

General solution in closed form has not been obtained, but it 
S possible to draw a few broad conclusions, among them that the 
ilter-functions can be physically realizable. Approximate solution 
for the optimum scatter-path receiver) at small signal-to-noise 
‘atios yields a block diagram having interesting implications. For 
1 single-scatter-path, the optimum receiver may be interpreted 
is the combination of a correlator with an optimum estimator of 
he Wiener type. Certain special cases in which complete solution 
s possible have been investigated in detail, and appropriate curves 
are presented. 

The role and performance of the probability-computing detector 
n an optimum decision-making receiver, for the types of channels 
onsidered, is deferred to a companion paper. 


INTRODUCTION 


| HE PROBLEM of detecting the presence or absence 
i} of a Gaussian signal transmitted through a channel 
containing additive Gaussian noise has received 
uttention from several authors.’ ° Relatively few explicit 
results, however, appear to have been obtained on the 
vest detection procedure for such random transmitted 
signals, in comparison with similar studies’’’’* that have 
lealt with the detection of transmitted signals whose 
waveforms are completely known to the receiver. 
Heaps’ analysis’ for Gaussian signals in Gaussian 


* Manuscript received by the PGIT, August 31, 1956. The re- 
earch in this paper was supported jointly by the Army, Navy, 
Fo Air Force under contract with Mass. Inst. Tech., Cambridge, 
Vass. 

+ Lincoln Lab., M.I.T., Cambridge, Mass. 

11P, Middleton, “‘On the detection of stochastic signals in additive 
10rmal noise, I,”’ to be published in this journal at a future date. 

2R. C. Davis, “The detectability of random signals in the 
yresence of noise,” IRE Trans., PGIT-3, pp. 52-62; March, 1954. 

3. L. Kaplan, “Signal detection studies, with applications,”’ 
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noise, an interesting contribution to noise theory, does not 
appear applicable to the optimum detection problem. 
Kaplan’s results’ apply to a particular detection scheme 
which is, in general, not the best one attainable. In 
the work of Peterson, Birdsall, and Fox,’ a best detection 
scheme is derived for the very special case in which signals 
and noise have rectangular power spectra of equal width. 

It appears to be only in the work of R. C. Dayvis,? 
based on Grenander’s studies, that a completely rigorous 
and general approach to the problem of optimum de- 
tection” of Gaussian signals in Gaussian noise has been 
formulated. Here, however, practically no results have 
been obtained in closed form, even in the somewhat 
unrealistic case of the complete statistical specification 
of noise and signal. 

The main purpose of this paper is to present functional 
forms for optimum detectors which, in general, are more 
explicit than those of Davis, and which, in some special 
cases, can be specified in terms of operations carried out 
on the received signal w(¢) by filters of completely de- 
termined characteristics. Such extension of Davis’ work, 
however, has been made possible only by making certain 
simplifying assumptions at the start, such as requiring 
complete specification of all statistical parameters and 


9 In order to define the ‘‘optimum”’ detector, we assume a com- 
munication system model in which the transmitter sends through 
a noisy channel a waveform z“)(¢) drawn at random from one of a 
finite number M of distinct ensembles, each ensemble having 
complete statistical specification. The generating ensemble is itself 
selected according to a symbol s;, k = 1, 2, ..., M, originating in 
an information source at the transmitter, with a prior? probability 
P,,. The detector observes the received wave w(t) over a time interval 
T and takes into account @ priort knowledge of the P; and the 
statistical parameters of the MM ensembles and the channel noise. 
Using this data, the detector yields information concerning which 
of the s, was encoded at the transmitter. We distinguish between 
two kinds of optimum detectors: 

1) If it is not required that the detector make a decision, or 
“suess,’”’ no information about the s; is lost, according to the philoso- 
phy of P. M. Woodward and I. L. Davies, ‘Information theory 
and inverse probability in telecommunications,’’ Proc. JHE, vol. 
99, part III, pp. 37-44; March, 1952, if the dector output consists of 
the a posteriori conditional probabilities Pp [sz/w(t)]. (The sub- 
script J’ indicates a limited observation interval; it will be dropped 
when the observation interval is allowed to extend over an infinite 
interval.) Accordingly, a detector that computes these probabilities 
will be defined as optimum in this case. 

2) If decision is required, an optimum detector is one in which 
the ill effects resulting from a ‘“‘wrong guess’”’ would be a minimum. 
For example, in the latter half of this paper we are concerned with 
the decision-detector that yields minimum probability of error, 
Siegert’s Ideal Observer. This is only one detector, however, in 
the broad class of optimum decision-detectors that arises from 
Decision Theory, a topic which has received extensive and re- 
warding investigation by Middleton and Van Meter, loc. c7t., and 
Peterson, Birdsall, and Fox, loc. cit. It is beyond the scope of this 
paper to review these many important contributions but we may 
remark that one most significant result of these studies is that, when 
available, the computed Pr [s,/w(t)] govern the decision in all such 
optimum decision-detectors. 
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assuming that the Gaussian channel noise has a ‘‘white”’ 
spectrum of unlimited extent. The Gaussian signals also 
assume a special form, as will be seen in the first part of 
the next section. 

Most of the analysis to follow was originally pursued 
with the intention of finding an optimum detector for the 
transmission of members of a finite set of known band- 
pass waveforms over a channel containing multiple 
“scatter-path’’? modes of propagation and additive white 
Gaussian noise.’ Since such scatter-modes perturb this 
transmission into a Gaussian signal, the same analysis 
serves both the scatter-multipath and random-signal 
detection problems, with the latter receiving the greater 
emphasis in this paper. Although the following work was 
thus done independently and in a different context from 
that of Davis, and employs sampling-point methods that 
may be somewhat less rigorous than the eigenfunction 
expansions used by Grenander and Davis, many of the 
results can be shown to agree with Davis’ general ex- 
pressions. 

This paper is divided into two parts. Part I presents 
general and special derivations of the optimum, prob- 
ability-computing detector (nondecision) from both the 
random-signal and scatter-multipath aspects. Part Il” 
will give results on the performance of the Ideal decision- 
making detector in some special cases, together with 


comparisons of the performances of other decision 
detectors. 
DETERMINATION OF THE OptimMuM, PROBABILITY- 


Computina Derrecror ror RANDOM TRANSMITTED 
SIGNALS AND FOR KNOWN SIGNALS TRANS- 
MITTED THROUGH ScATTER MULTIPATH 


Form of Random Transmitted Signal and Its Relation to 
Scatter Transmission 


We assume throughout this paper that the random 
transmitted signal z\"’(¢) is given by 


2° = LeU = aly — 1) 


where [y,({)*0°” (£ — r,)] denotes phase modulation of 
a stationary narrow-band Gaussian process y,(¢) by 
a (t — 7,): 


y(t)*O(t — t,) = Yor(€) sin [wot + O° (t — 7,)] 


+ Yep(t) Cos [wot + O° (t — t)] (2) 


10 Harlier developments of the work contained in this paper 
may be found in: 

R. Price, ‘Statistical Theory Applied to Communication Through 
Multipath Disturbances,” M.I.T. Res. Lab. of Electronics, Tech. 
Rep. 266; September 3, 1953. 

R. Price, ‘‘The detection of signals perturbed by scatter and 
noise,” IRE Trans., PGIT-4, pp. 163-170; September, 1954. 

R. Price, M.I.T. Lincoln Lab. Group Reps. (Not generally 
available. ) 

Related studies have been carried on by: 

W. L. Root and T. S. Pitcher, “Some remarks on statistical 
detection,’ IRE Trans., vol. IT-1, pp. 33-88; December, 1955. 

G. Turin, M.I.T. Lincoln Lab. Tech. Rep. (Not generally 
available. These results are to be published:in IRE Trans., PGIT.) 

11 To be published later in IRE Trans. PGIT. 
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Decembe 


using Rice’s expression” for narrow-band Gaussian noise 
Assuming for the remainder of this paper that the spectrun 
Y,(w) of y,(t) is symmetric about w., Ysp(t) and y,,(¢) ar 
independent low-pass Gaussian» processes of zero meal 
and autocorrelation function ¢,(7): 


I 


(7) Ysr( HY sr(t “| 7) > Yer) Yorit bs 7) 


I 


ie Y,(@) cos @ — wo)T dw. (3 


The amplitude and phase modulations, r“?(¢) an 
g(t), respectively, are members of a finite set of 20 
waveforms known a priorz to the receiver. In conjunction 
with the y,(é), these modulations form the M ensemble 
from which z‘"’(é) is drawn, as given in the definition o 
the optimum detector.” The 7, are arbitrary delay con 
stants. 

Iq. (1) has been chosen for the transmitted signal i 
order to allow the results of this analysis to extend to th 
scatter-multipath detection problem. While (1) is conse 
quently rather restricted in form, it covers a sufficienth 
broad class of random signals to be of interest. 

To demonstrate the appropriateness of (1) to thi 
scatter-multipath problem, let us suppose that a non 
random, narrow-band signal x“’(#) is transmitted ove 
scatter multipath 


a(t) = r (8) sin loot + 0 (O]. (4 


Assuming that the N-scatter paths have various delay: 
tT», and that the path fluctuations are slow compared t 
w, present theory’ indicates that the response of th 
pth path to an unmodulated carrier x)(¢) = sin wot is thi 
Gaussian process y,(¢) discussed in connection with (2 
and (3). Thus when x“? (#) is transmitted we obtain (1 
after scattering over all N paths. A diagram of the com 
plete communication system, illustrating the relationshi 
between the random-transmitted-signal system and th 
scatter-communication system, is shown in Fig. 
opposite. 


General Solution for the Optimum Detector™* 


The received signal w(t) = 2“(é) + n(é), where 2“ (¢ 
is given by (1) and n(t) is a white Gaussian noise o 
spectral density N,/2m defined on the basis of a single 
sided spectrum and an angular-frequency variable « 


| 


2S. O. Rice, ““Mathematical analysis of random noise,” Be 
Sys. Tech. J., vol. 23, pp. 282-332; July, 1944, and vol. 24, py 
46-156; January, 1945, See sec. 3.7. 
6S. O. Rice, “Statistical fluctuations of radio field strength fa 
i the horizon,” Proc. IRE, vol. 41, pp. 274-281; February 
_*H. Sherman, “The Detection and Phase Determination o 
Signals in Additive and Multiplicative Noise,” dissertation sub 
mitted in partial fulfillment of requirements for degree of docto 
of electrical engineering, Brooklyn Polytech. Inst., p. 63; June, 1955 
H. Sherman has carried out studies on the sampling point detectio 
of signals in correlated Gaussian additive and multiplicative nois 
that are the discrete analog of this analysis. As in the case of Davis 
work, the results are largely in open form, since they involve th 
inversions of high-order matrices. 
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\ RANDOM | 
PROCESSES | 

(SCATTER PATHS) Guinz 

a | 

| 

| 


1 
| 
! N 

(k) a (Kk) (k) 
| +. >| Ua dp 8,j—dp ae Hees dpt Ge, i—dy lysis Ar NB6;; 
R[s,Avit)| ae Veh cae, (8) 


as, /wir)| Meciy = Wei We; 


GAUSSIAN 


OPTIMUM 
DETECTOR 


(NON-DECISION) wi MMesng = SUWen 


8 [su /wit)} = (hk) (ic) (k) (hk) 
= 4 [aoe eae a Le, ie -dpUs, j—dp|Ppii 
p= : 


A_PRIOR/ 
INFORMATION 


where ¢,:; = $,1(@ — 7)/B] and 6,; is the Kronecker 


RANOOM-TRANSMITTED-SIGNAL SYSTEM’ j 6 function. 
TRANSMITTER CHANNEL RECEIVER The inverse, or a postervorn, probabilities Pr [s./w(t)] 
| ack | are found from analysis carried out in the opposite, or 


| 


al : p ee : 
TRANSMITTER © CHANNEL RECEIVER ! forward, direction ; 2.€., assuming the cause Sk and 


studying the effect w(¢). Using Bayes’ ae 


‘ig. 1—The random-transmitted signal and scatter-communication 


systems. (h) (ih) 
P,[s,/w()] = lim Pprl(wes), (wei) /(ei VF (rex (9) 
Bow 
We shall first assume that the noise is confined to a rec- ye P-prl(Wei), (Wei) /Grsi’); z . 


r=1 


tangular band of width 27B centered on w,, and later 
ullow B to increase without limit Ggnoring spreading into 
the negative-frequency region), so that the assumption of where parentheses denote the sequence of all sample 
white Gaussian noise is ultimately satisfied. values they enclose. The subscript 7’ on the “forward,” 
All the information in the band-pass signal w(/) is conditional, multivariate probability density pr [(w,;), 
contained in its sine and cosine components, w,(f) and  (w,,)/(x), («)] indicates restriction of the (w,;) and 
w(t), respectively. Sampling these components on a (w,,) sequences to sample values contained in the obser- 
common-time grid of interval 1/B, we obtain, using (1), vation interval. In order to obtain the “forward” prob- 
ability densities, we note, from (5), that since y,,(?), 


(infinitely dense sampling) 


eee ee a oR he Yer(t), ns(£) and n,(¢) are independent Gaussian processes, 
(5) the sequences (w,;) and (w,,;) share a joint Gaussian 
N N . . . 15 Oya (k) . 
Wie OE ce Re Sey ea ee te distribution, onion! upon «°’(t) being known. 
ci (ner s,i—dpJ cpt cr c,t—dpdJ spt ct Thus, from Rice, 


Drl(ws), (Wead/(a?), (e?)] 
=? Ki = 1 - - Mw W, it WD 5 5 ar MS? wWeiWei aia Mi) WeiWe; =— MO WeiWe 
= (2m) (ei ae | TTD ‘4 “ a | uM | (10) 
j=l 7=1 


where the subscript 7 denotes a sample value of the where M“” is the symmetric moment matrix 
sorresponding waveform at ¢ = (¢/B) + t, with the 


5 : : 5 (k) (k) (k) (k) (k) (k) 
»bservation interval commencing at /,. The functions Mesi1 Mssi2 °° * Messin Msc11 Mseiz “°° Mscin 
(k) (k) (k) (k) (k) (k) 
a (£) = r (£) cos 6‘ (t) (6) Ms 321 $322 ss2n e21 sc22 c2n 
k k 2 k 
eo? () =r) sin 0” 
2 s (k) s (k) (k) (k) (k) (k) (k) 
wre the sine and cosine components of x*"’ (4), respectively. U® a | Moen Momma 00 ' SD A PE OO. Te os (11) 
Likewise, 7,(f) and n,(¢) are the sine and cosine com- Gan en pen es on 
5) s i inte 
S Mesit Mesi2 5 Mesin Meci1 Mee12 Mecin 
ponents of n(¢), respectively. We may assume that the 
2 : (k) (k) (k) (k) (k) (k) 
r, are integral multiples of 1/B, and we have defined AD OPEL eR Ue LD LDN A OLED 
= 7B. (7) 
] f (ke) (k) (k) (k) (k) (k) 
Assuming that the y,(¢) processes (or paths) fluctuate me mie sein, men ean Oe 
independently of each other and of the noise, we find the f| 


second moments of w,; and w,; to be, for s, encoded at 


the transmitter, 
(k) =a ae 16 H. Cramer, ‘““Mathematical Methods of Statistics,’ Princeton 
MNssiz = WsiWs; Univ. Press, Princeton, ING doy > Bilas ay, 
165, O. Rice, “Mathematical analysis of random noise, 


(k) 
MNecig = WeiWe; sec. 2.9. 


”) 


OpeuCite. 
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1) ise Pam Bhi eee) Be and M\. are the cofactors of m{*,,, 
m® . m®., and m\®.,, respectively, in M“ and | M“™ | 
is the determinant of WZ‘. There are n = BT sampling 
points in the observation al of length 7. 
Substituting for the cofactors 
ve b Mae 
ha = 6; ir NoB Typ % M®] | ; 
ke M vei 
hecis = 945 — NoBiap@y 
bases) (12) 
(k) : Mia 
we — N ob | M® | ; 
Us) 5 Mess 
he. = —N Ba 


we find, by straightforward matrix algebra, that h{?, 
and h;", are also specified by the set of simultaneous 
equation pairs 


n 
(k) (k) (k) (k) (k) 
S [mserhse siq =F Marlee cs | = Ms 514 rz N B56; 
i= Pj elen 
n 
(k) (k) (k) (k) (k) 
De [mes 1iNes sij == MeeriMesis | = Mesti (13) 


t=1 


with hi”, 


ecty 


and hi"... given by a similar set. Using (8), it 


may be shown without difficulty, from these sets of 
equations, 
(k) (k) (k) (k) (k) 
Riess = hecis = i a h ssji — =h.3; = hii; | (14) 
(k) CS) Owe AO Se ACD) 
Neece = —h, SIC/0¢) ae —hesi: = lier = hess 


Substituting (10), (12), and (14) in (9), we find 
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we have 
Mas. | 2n ( sl 
a = 1 19 
(NB) ~ I] a NB \ 
with (17) modified appropriately for the new ),/’. 
Carrying out the limiting process of (15), the sum 


mations of (13), (15), and (17) become integrals, witl 
limits 4) and f, = ft) + T, and we have 


peestal = 


Rs Pils 
r=1 


P7[s, /w(t)] (20 


where 
Ln = Fr? exp | ay aff [w;(éw,(7) 

+ wadwl y(t — 7,8 

— [w,()w.(2) — mre S(t = 7, 0} drat] Qt 
h\® (£ — 7, t) and hs” (t — 7, t) satisfy, from (8) and (13) 
[Pe 9 + Neale = DIP = 70 


= $1"(c, t) 
Gg th S th 


_ &5'"(o, ths (t = t2b)) dr 


to (22 


[ 2, OE = 7,0 + [2% C, 2) 


l| 


+ Noile — z)]he(t — 7, 2} dr 


where 6(¢) is the Dirac 6 function. We have made tht 
substitutions 


b;'""(c, 0) 


M® =—1/2 n n 
Qn Pe exp a ey Ss [(w.iWs; = WeiWes) Mii; = (WsiWe; a, WeiWs;) ois | 
P yi ns (NB) 2N.B v=) Soak 
r|s./w(t)] = ee M | uo =1/2 n . (15 
aes sl : pas W.: — W..w. ne 
a [et P, exp tots n 2» | (w.iWes + WeiWes)hit; — (Weise; m.)hS0]) 
In order to find the limiting behavior of the | M\” |/ Pe ie = (k) 
(N,B)*", we note that, from the theory of determinants, Ue Tet ee mobo — 7) (23 
N 
pM | = TTX? (16) B:°(0, 7) = Do Xsnda(o — 2) 
t=1 
and 
where the { are the eigenvalues of the homogeneous <i a ik 
system of equations corresponding to (11): Miho) Oe tee = Tie (eae | 
; + 2. (¢ — 7,)2.° (7 — 
SN ioe ea ane " Meu aig (24 
ea GeV les Nscij 2r7 nl 1li : Xe (Ge r) = a” (¢ = tas? (7 : T 2) 
Mem Ree see IG) 
= [m (k) ye +m (k) Gay = me i) (17) = 2 OG = Tp)t6 (4 = To) 
Ss From (17) and (19), 
and the pair of sequences y{"), ws7?, with 1 = 1, 2, ---, 2n, P &) 
is the eigenvector corresponding to the anlar Ne as a Llane (25 


Letting 


Xn = Ant + NoB (18) 


where the \;" are the eigenvalues a the pair of integra 


equations, 


956 
el 


| = V() 
to sss o = t, 


[B?(o, NYPD) + Be, DY® 


| (26) 
&,"(c, 1) bar (7)] dr 


= War (c) J 
(k) (k) 


nd the pair of functions y{", wy? gives the eigenfunction 
orresponding to \;”” 

Eqs. (20) through (26) constitute the solution for the 
ptimum probability-computing detector in the general 
ase of multiprocess random transmitted signals, or for 
satter multipath. The simultaneous integral (22) and 
26) can be solved in special cases to be discussed in the 
»mainder of Part I. 

We may note three interesting results of a general 
ature that are obtainable from (22). First, if either 


[= #20, VPC) + 


Gi biaet,)— OF at, Re v=) 


iG =e Ts) =0= De Gh a Tes 


for all p (27) 


and hs" (¢ — 7, ¢) vanish. Thus, if 
1e waveform x"’(¢) vanishes over an interval long 
rough that only the white Gaussian noise is received 
uring a certain period, the receiver may ignore that 
ariod with no loss of information about s,. This result 
yrees with intuition. 

A second and more subtle result is that the double 
tegration of (21) can be performed by physically- 
alizable filters, in the sense that hy"? (¢ — 7, é) and hy” 
-— 1, t) may be allowed to vanish for 7 > ¢. For, from 


4) 
hy (7 — t, 7) | 
ha a: — t, 7) 
hus, the integrand of (21) is symmetric in ¢ and r. Since 
r any symmetric function K(é, 7) = K(z, 4), 


[ [ wGndra-2f | abn aieieuk 


id the upper limit on the inner integral in (21) may be 
1anged from ¢, to ¢ [provided that the factor of 2 in (29) 
allowed for], thus permitting hj” (¢ — 7, #) and hy” 
— r,t) to vanish for 7 > ¢. 

Finally, it may easily be shown that the form of (21) is 
variant to a purely rotational transformation of w,(é) 
id w,(é), a result justified intuitively by the complete 
ndomness of initial phase in 2" (f). 


r both, hy” ( — 7, bt) 


) 


l| 


eG — 7, t) 
Ot cee; t) 


(28) 


(29) 


pecial Cases of Multiprocess Signals (or Scatter Multipath) 


Amplitude-Modulated x(t): If x(t) is a purely 
nplitude-modulated carrier, ®;° (, r) vanishes, so that 
2) and (26) become, respectively, 
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tr eS Tr) Oyo — T) 


+ N,d(o — jure — si ob )h OT, 


eae os (30) 
see : iG = tyr (t — Toro — 2] 


‘i 
_ 


to < 0, t < ty 
=on, = 0 J 


tyr (4 — 1,)¢,(0 — n [vic dr 


ety (ola) lg ag 5 a ee 


with each );"” now counted twice in (25). If, in addition, 
r (¢ — 7,) = y, a constant, for all 4 < ¢ < 4, and for 
all p, and if the ¢,(7) all have rational algebraic Fourier 
transforms Y,(w), (80) and (31) can in principle be 
solved by the methods of Zadeh and Ragazzini,*'” 
Middleton,’* ‘Youla,’’’’° Slepian,* and Miller and 
Zadeh.”” 

Very Slowly Changing Processes or Scatter: If all the 
y,(t) processes have such narrow spectra Y,(w) that they 
are nearly sinusoidal during the observation interval, 
with negligible amplitude or phase modulation, we may 
set ¢,(7) = ¢,, a constant, for | 7 | < T. Eq. (22) may 
then be solved by letting 


N N 
ht — 2,0) = D7 Dole? Kans Oe ae) 
p=1)-g=1 


N N 


hig? (bo iggit) = lee ee a 


p=1) ‘\g=1 


(82) 
bp; Xana tsb) 
Substituting (82) in (22), we find that both integral 


equations of (22) are satisfied if 


N 


lace Dee be, Ele soe Na 
= lax G2 < Nes) 
[—a,, Ee) + bee D3, | 80 
pH=l 
where 
ty 
DY =a [XM 0 dt + 4,,/0, 
0 t 
: (34) 
fee | ent 
to 


““An extension of Wiener’s 


iw. A. Zadeh and J. R. Ragazzini, 
vol. 21, pp. 645-655; July, 


theory of prediction,’ J. Appl. Phys., 
1950. 

18D. Middleton, M.I.T. Lincoln Lab. Tech. Rep. (Not generally 
available.) 

19]. Youla, “The use of the method of maximum Rerraae 
in estimating continuous- modulated intelligence which has been 
corrupted by noise,” IRE Trawns., PGIT-3, pp. 90-105; March, 


1954. 
AD 1D). Youla, ° ‘A Finite-Time Homogeneous Weiner-Hopf Integral 
Equation,” Res. Rep. R-445-55, PIB-376, Microwave Res. Inst., 


Polytechnic Inst. of Brooklyn, Brooklyn, N. Y., October 3, 1955. 
BID). Slepian, * ‘Estimation of signal parameters in the presence 
of noise,’ IRE Trans., PGIT-3, pp. 68-89; March, 1954. 
21K.’§, Miller and L. A. Zadeh, “Solution of an integr al equation 
occurring in the theories of prediction and detection,’ IRE Trans., 
vol. IT-2, pp. 72-75; June, 1956. 
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Solving (33) for the a}"?, infinite series. By successive approximations we fin 
ui from (22), 
a? = ———2r (35) : 
i na Ty (k) ¢ 
Fo Mie AG — 7,6) = aa @)"(7, ) — as 
1 Te oad N i ) N2 
0 0 


where WM“ is the symmetric matrix be 
Moe [De] gall (36) fC, B,D — BC, oO", OI 
[ 


—FESP | [DP] + terms of order N>° and smalle 


| M° | is the determinant of M“ and M;? is the cofactor __,,, igen 1 
of D“® in (36). Substitution of (82) in (21) yields terms het) No Eig Oo N? 
involving the a,” only, so that it is not necessary to find é 
ae Poe ee Bens eeict il [Bi (7, @) B29, ) + OP(r, a) BCG, HJ 
o find the F’, of (21), we le to 
N eS a + terms of order Nj° and smaller. (4 
(k) ae A®e® (4 — NeoPa Ge 
var (7) 2 Gti ita) ( ) were F, may be approximated by 
N © © 
PG BG yt eal log, F, = ap SM” — sae OP 
eau () 7=0 0 l=1 
Both integral equations of (26) are then satisfied if A}? + terms of order N5* and smaller. (4 


and By” satisfy 
Examination of (26) shows that it is really a sing 


N 
pet Ay De Bene | homogeneous integral equation, despite the dual forr 
a=1 Hence sums of the form 
= (1+ /No) A " 
Li pe eG 38 ; ; 
N sath tr ( Deana le gl er a 
Yyonas Mesh dey erm cle gd) 2) ce 
a=1 . may be found” by successive quadratures of the kern 
=fic+ i? /N | B® | of the single equation obtainable from (26). We find, f 
Since the quantities [1 + \j"/N ] are the eigenvalues of ie 
the system of (88), and there are therefore 2N of them, Sk) x i es 5 
their product is the value of the determinant correspond- > Vales > $,(0) i bP (a = ay) h, do (4 
ing to the system (38). Thus iy al ee 
oN w DA?) = on / (O(c, DP + [8(c, DP} dr d 
F, = [] [1 + ?/No] = | 98 | PDE A(S ON tes Chie é 
l=1 p= ‘ 


since the systems (33) and (38) are similar. Substituting Substituting (43) through (46) in (21), 


ing (32), (35), and (39) in (21), we have, finally, 1.» Rane heeees 
log. In © 53 x if Hi twiecol we T)b(t = 


N il 
Te we | I». | ae [aoe 


jhe 
N WN (k) Wa eS Wan OLA T) 
exp {i 3 3 A ah a® + Be By Go) Ne et 


= ert aS, | 1 
where aig N, > Woe (A) Usz at, » | 
Cee, Slee ety Rha ad 
AP = af WO ai + wool wey Ne(t-)- + SwenuvHee 
(41) No q=1 
(‘Hae= pis 4 (k) ‘(oe 
Be =a |] WO at —5- DS wP@uLe, »|} Beek 
0 q=1 
and we + 
asd pes (k) nate 2 
® i) = g(t — 7,)w,.(f) + af (t — oy (42) N, 2, 4,(0) 4 [r'(¢ — 7,)P do 
Oe oe rete (Pee Oe) 


This result appears in a different form in some recent 
work of Turin’s.”° 
’ i Oe F , . °5R. Courant and D. Hilbert, ‘Methods of Math ti 
Infimte Series Solution for Large No: For sufficiently Physics,” Interscience Publishers, New York, N. Y., sa 138: 195 
large No, (22) and (25) may be solved by developing [Integrate (58) over s and #.] ' 
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to a second-order approximation, where 


URE, ) = [ XIMe, able — Db. — 1) do 
(48) 


UIE 2) = | XR, ooo — Od.le — 1) do 


and Wy? (t) and W$})(#) are given by (42). For very large 
No. we have, from (47), the first-order approximation, 


N 
log. Ly & Ss ion i i. (Wi? OW? T) 


p=l1 


+ Ws, (QWs (7) ]b,(t — 7) dr dt 


ty 
_ a i rr’ (e — 7,)]) ia}. 
Thus, at large N, the optimum detector analyzes the 
contribution of each path separately, as though it were 
the only path present, and it is only when N, decreases 
that interaction terms such as those of order No* in (47) 
become significant. 

In Fig. 2, one possible form is presented for an analog 


(49) 
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Fig. 2—First- order approximation to kth computing element of 
optimum receiver. 


computer to calculate (49). We assume that the signal 
z(t) has, as before, the carrier frequency wo, but that 
the stored reference waveform 2x\")() differs from the 
transmitter «”(t) in having a carrier frequency w, = 
Wo + Wa, 


a(t) = 2 () sin w,t + 2() cosa,t. (50) 


Assuming that w, is small compared to w, but large 
compared to 1/7 and to the highest-frequency com- 
ponents of 25") (£) and 2; (¢), it may be verified that 


7 2 (t — 7,)w(t) dé 
f a(t — 7,)w(r)o,(t — 7) coswi(t — 7) dr 


5 / ; it i [Wi OWE (2) 


+ Way (QWap (r)lo(t — 7) dr dé (51) 
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with the substitution of the variable limit ¢ as discussed 
in connection with (29). 


Optimum Detection for a Single-Process Random Signal 
(or a Single Scatter Path) 


When there is only one random process or path present, 
N = 1, and (21) and (22) may be simplified by letting 


heb, ) =] X2G Oa = Ot. (52) 
ha (b=, t= Xo, De ee) 
Eq. (21) then becomes 


Lee tir exp vier / [Wir OWi? (x) 


Wer (OWsP (dhe — 7, 8) dr dt (53) 
and both equations of (22) yield 
[eo = n)Pee — 9 

+ Nilo — r)}h\(t — 7, t) dr 

= ¢(¢ — 7); CR EE Uh (54) 
Similarly, letting 
ae (55) 
PG) = 0 — nr) WPO) 
both equations of (26) yield 
[°C = Poe — DPC ar 
Dvr); tSe5t (66) 


and each of the \)” eigenvalues of (56) should be counted 
twice in (25). 

In general, the solutions of (54) and (56) are not known. 
For particular r“?(£) and ¢,(r), however, they may be 
solved, and these special cases will be considered in the 
next section. 

We may, however, draw one very interesting general 
result from (53) and (54). If we assume that s, is to be 
encoded at the transmitter, and in (42) substitute for 
w,(t) and w,(t) the expressions corresponding to the 
continuous form of (5), we obtain 


WROD E — 1) = 1 O(t — yal) + oe 57) 
(P(t — 1) = rt — r)yerlt) + ni) 


where n/(t) and n{(¢) are independent white Gaussian 
noises statistically identical to n,(¢) and n,(¢). Now if, 
somehow, y,:(¢) and y,.(t) were completely known to the 
receiver a priori, for fy < ¢ < t, rather than just their 
statistics, the optimum detector would compute, accord- 
ing to a simple extension of the theory of Woodward and 
Davies,” for the single channel disturbed solely by white 
Gaussian noise, 


24P. M. Woodward, “Probability and Information Theory, with 
Applications to Radar, ” McGraw-Hill Book Co., Inc., New York, 
INTG M658 3, 
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been solved explicitly for particular forms of r“(é) and 
¢,(r), and these solutions are now presented.”° 
Case I. Solution for Constant r“(d) and Exponential 


¢,(7): For 
(61) 


(62) 


r(t{ — 7.) = y, a constant 
$,(r) = Be oe 


Eq. (54) can be solved by the method of Zadeh and 
Ragazzini,'” yielding 


¢=— 


ol’ T | os (—2acT) cosh [ac(t — 7)] + al’ a ) exp (—acT’) cosh [ac(t + 7 — to — t)] 


Comparing the w(t)-dependent portions of (53) and (59), 
we may draw the conjecture that the integrals 


Thy = / We Gh ee Feds 

J to (60) 
es / W(Dr(t — 7, b) dr 

Jibs J 


yield estimates of y,,(¢) and y,.(é), respectively, for the 
actual system considered. This conjecture proves to be 
correct, and, in fact, the integrals (60) are both maximum- 
likelihood and least-mean-square-error estimates.’ Re- 
ferring to Youla’s work’’ on the maximum-likelihood 
estimation of a modulated Gaussian waveform in noise, 
we see that Youla’s (1), (25), and (26) are identical to 
(57), (54), and (60) of this paper, respectively, with the 
following correspondences: 


Youla | This Paper Youla | This Paper 
Mit) | r®(t — 74) Wt, r) | RM(t — 7, )r(r — 74) 
a(t) Ysilt) or Yerl(t) (f= UG to; ti 
n(t) CO) Ove ie (CH) a*(t) I,(t) or Iy(t) 
eé,(t) [Wis (t) /r®(t =) Ralt, T) gill =a) 
|\or Wa (t)/r™(t — 71) | Balt, 7) Nod(t — 7) 


The same general result has also been obtained’? using 
sampling-point analysis rather than the eigenfunction 
expansions employed by Youla. 


Special Cases of Complete Solution for Single Process or 
Scatter Path 


Under “Special Cases of Multiprocess Signals (or Scatter 
Multipath),’’ we discussed three special cases in which 
the multiprocess or multipath integral equations (22) and 
(26) can be solved, at least in principle. These solutions 
exist also, of course, for the single-process or single-path 
equations (54) and (56). In addition, (54) and (56) have 


25 The suggestion of using a correlator in conjunction with an 
estimator for a multipath receiver first appeared in studies whose 
results are presented in Root and Pitcher, op. cit. 


e—i1\ 
1 - (4) 


exp (—2acT) 


+ exp [—ac | ¢t— 7 |] (63) 
where 
e= 4/1 + = (64) 
- n 
and 
Up ee Nya/(By’). (65) 


Youla’’ has given (63) in his (39), in slightly different 
notation and for y = 1. The dependence of the shape of 
h (¢ — 7, f) on 7 is illustrated in Fig: 3 (opposite). 

The solution of (56) for 7’ (t) and ¢,(r) given by (61) 
and (62), respectively, has been found by Kac and Siegert?’ 
and by Slepian.*’ The eigenvalues are 


(66) 


where the R, are the positive roots of the transcendental 
equation 


26 A. J. F. Siegert, ‘“A Systematic Approach to a» Class of Prob- 
lems in the Theory of Noise and Other Random Phenomena. II. 
Examples,’’ Rep. P-730, The RAND Corp., Santa Monica, Calif., 
September, 1955. ‘“‘Passage of stationary processes through linear 
and non-linear devices,’ IRE Trans., vol. PGIT-3, pp. 4-25; 
March, 1954. 

D. Middleton has called the author’s attention to these two 
pertinent papers. Herein, closed forms for products such as (25.) 
are obtained without having to solve (56.). Thus for Case I we find, 


from (2.32) of the RAND report, 
WD — ==] 1 ° 
— Z h ee 
a e ’ (cosh ge + € E + x sinh ge) 


with c and No given by (64.) and (70.), respectively. For Case II, 
from (2.48) of the RAND report, 


ye - CPL OR Ade KNOG 


with « = exp (—yT) andd = V 4y/No. The J’s and K’s are modi- 
fied Bessel functions and » and No are given by (73.) and (82.), 
respectively. In both cases the \“); are counted twice in (25.), as 
stated following (56.). 

In general, it appears that since h“)(¢ — +, t) is the Volterra 
Reciprocal Function yielding the Fredholm Determinant F4/?;,, a 
complete solution of the single-process or single-path case need deal 
only with (54.), rather than both (54.) and (56.). 

27M. Kae and A. J. F. Siegert, “‘On the theory of noise in radio 
receivers with square law detectors,” J. Appl. Phys., vol. 18, pp. 
383-397; April, 1947. 
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Fig. 3—The optimum filter function h“ (¢ — 7, t) as a function of 
the signal-to-noise ratio. 
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The values of e(N,) are presented in Table I and Fig. 4 
(below) but details of the computation, such as securing of 
good convergence for (25), will be found elsewhere. ’? - 


Solution for Exponential r“ (t) and Exponential $i (7); 
For 
r (t= 71) = yer» pw pokitive (71) 
and ¢,(7) as in (62), (54), and (56) can be solved by the 
methods of Youla'’ and Juncosa.”* The integral equations 
are converted into second-order differential equations 
using Youla’s method, and Juncosa’s transformation is 
then applied to yield Bessel’s form. The integral equations 
are thus solved by fitting Bessel functions to appropriate 
boundary conditions. We find 


| - LA Say ay Sm re (Sa EO, EoD) ain PG AK, Geass) + EPA Sol CC WAG uKGG 3) + KG G) | 
Ty -1(80)K +4182) ; 1 GVO LAG |) 


TAKA) le ee 


“fe (72) 
IAG Ee ae (are tie 
TABLE I 
Tue Bras e(No) 
No al =10 al= 04 al = 150 al = 20 aT = 4.0 aT = 40 al’ = 400 
0.01 0 .046151205 0. 103626 0.145791 0.192107 0.900 
0.02 0 .078636513 0. 148721 0. 203217 0. 262492 0.339988 0.699 0.9444 
0.05 0. 15222612 0.236536 0.308192 0.384795 0.480307 0.82993 0.97620 
0.09 0. 2244710 0.313893 0.394407 0.479353 0.580928 
0.10 0 .23978953 0.329641 0.411328 0.497296 0.59922 0.901 0.987817 
OnE 0. 25427974 0.344375 0.426980 0.513714 0.615727 
0.20 0. 35835189 0.446570 0.531190 0.618650 0.715968 0.9449 0. 9938344 
0.50 0. 54930615 0. 622932 0.696052 0.769268 0.843362 0.97646 
1 0.69314718 0.748831 0.804304 0.858203 0.90923 
2.0 0. 81093022 0.848083 0.884701 0.919345 0.95048 
5.0 0.9116078 0.930115 0.948112 0.964696 
10.0 0.9531018 0.963169 0.972900 0.981770 0.98940 
20.0 0.9758032 0.981062 0.986148 0.990712 0.99484 
OR, 20.0 T ir 
tan gR Steet 4 
ee (67) a ASSUMPTIONS: 
BO = MM\pey 
g al J Maebor™ 
: , : 2.0 
Using (66) in (25), with each \;” counted twice, as ex- g= aT 
é ; nies Ke Ae 2 =) 
plained in connection with (56), values of the bias < Neaeatey a) 
—— as O505— 
cv y= losni (68) l2 
== 0.20 - 
have been computed for various VN, and g, where ae 
No = N/E; 0.05 |- 
ae ty (69) 
B, = 40) | FC — n)f ae 002 
to 0.01 


E, is the average signal energy received during the 
observation interval when s, is encoded. For the special 
case under consideration, 


ass No 


eae eo Te 70 
Ga (70) 


Fig. 4—The bias constant e(No) as a function of the signal-to- 
noise ratio. 


* M. L. Juncosa, ‘An integral equation related to Bessel func- 
tions,” Duke Math. J., vol. 12, pp. 465-471; 1945. 
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where 
v= a/p (73) 
= 
See 2, 
o. = pe" ie 74 
rea) (74) 
? =t — i; 7 =.17 — tb (75) 


and J,(x) and K,(x) are modified Bessel functions of the 
first and second kinds, respectively, of order v. As np — 0, 
(72) must approach (63) as a limit. In Fig. 5 (next page), 
(63) and (72) are compared under similar conditions to 
illustrate the effect of the form of r’(¢) on the shape of 
h (t — r,t). Asn > &, the shape of h (¢ — 7, f) con- 
verges to that of ¢,(f — 7) irrespective of the form of 
rv (¢), in agreement with the result (49) for large No. 
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Fig. 5—Effect on the optimum filter function h™ (¢ — 7, #) of 
varying r‘*)(t). 


In connection with the use of h“’ (¢ — +, 2) as a lest- 
mean-square-error estimator of y,,(¢) or y.i(d), as discussed 
in “Optimum Detection for a Single Process Random 
Signal (or a Single-Scatter Path), it is of interest to 
examine this minimum error /(¢). Using (54), (57), and 
(60), we find 


lI 


ya(t) — 2()2 
Woh (0,0) 


BO-= [yd — EOF 


l| 


(76) 


a result valid for all r“’(¢) and ¢,(r). In Fig. 6, the E(0) 
are plotted for the two cases considered in this section, 
under similar conditions. For the case of constant r“”(é), 
there is a rise in error near the end points of the obser- 
vation interval, produced by the reduction of the 
observable neighborhood in which y;,(¢) is approximately 
constant. For exponentially decaying r“(é), this effect 
is present at the start of the observation interval, but as 
r’(t) decays E(t) naturally increases. 

The solution of (56) for exponential r“(#) and ¢,(r) 
may be found by the method used in arriving at (72). 
The eigenvalues are 


(77) 


IRE TRANSACTIONS ON INFORMATION THEORY 


December 


where the F, are the positive roots of 


Teun cep eather) = Newite | by peas) = 0. (78) 


~ J,(x) and N,(x) are Bessel functions of the first and 


second kinds, respectively, of order v. For » = 1/2 and 
3/2, (78) becomes a trigonometric equation like (67). 
When 7 = o, (78) reduces to Kac and Siegert’s solution, 

ea tea) = 0, (79) 
which assumes simple trigonometric forms for » = 1/2, 
3/2, 5/2, and 7/2. For vy = 1/2, and 3/2, the R, are simply 


T 
ee 9 (21 — 1); pele fo (80) 
(ml; yp = 3/2 


and the e(N,) of (68) may, from (25), be found in closed 


form:” 


eat. 
Nev Ol = 1 


log. [ E + 


No 
l=1 
= N, log, cosh V 2/No; = 1/2 
=\— < 6 = ee (81) 
emi | 
ere I 2 Nor l’ 
=e yey ENTE, = 3/2 
V6/N 
where, from (69), 
mee ROUNG 
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Fig. 6—Mean-square-estimation error H(é) as a function of r‘(é). 


CONCLUSION 


Although complete solution of the integral equations 
(22) has been possible only in a few special cases, it 
appears that the optimum filter characteristics do not 


* P. Franklin, ‘Treatise on Advanced Calculus,” John Wiley & 
Sons, Inc., New York, N. Y., pp. 463, 473 (48); 1940. 


depend critically on the parameters of the systems under 


consideration. For example, Figs. 5 and 3, respectively, 
indicate that the shape of h“” (¢ — 7, #) does not change 


radically with variations in either r“’(é) or the strength 
of the noise. The ¢,(7) could probably assume a wide 
variety of forms other than simple exponential, without 
markedly altering the general behavior of h“ (¢ — 7, 2). 


: Similar observations would apply to bias constants e(N>). 


It is also likely that optimum receiver realizations 
arrived at by using assumption of Gaussian statistics will 
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perform well for a much broader class of random processes. 

In the case of the single-scatter-path channel, a 
functional form for the optimum detector has been found 
exactly, although the exact filter functions and biases to 
employ in the various elements have been established 
only in two special cases. The general result that the 
optimum receiver consists of an estimator-correlator 
combination is of particular significance, and even appears 
to apply to some extent for multipath channels when the 
noise is appreciable, as shown by Fig. 2. 


A Coincidence Procedure for Signal Detection’ 


MISCHA SCHWARTZt 


Summary—A coincidence method of detecting signal in the 


_ presence of noise is compared to the statistically optimum Neyman- 


Pearson procedure utilizing signal integration and threshold 
detection. In this coincidence procedure a specified number of the 
fixed group of successive pulses are required to exceed a voltage 
threshold level. The analysis is carried out for the case of constant- 
amplitude signals only and the results indicate that the best possible 
coincidence method requires about 1.4 db more power than the 
Neyman-Pearson method. 


INTRODUCTION 


HE NEYMAN-PEARSON statistical theory of 
aR testing hypotheses has recently been applied to 

the problem of the detection of pulsed signals in 
the presence of noise.’ * 

The detection procedure developed on the basis of the 
Neyman-Pearson theory consists of requiring the sum of 
K detected voltage pulses to exceed a specified threshold 
level. This level is determined by the allowable probability, 
P..»x, that noise in the absence of signal might exceed the 
level and be erroneously detected as a signal. The prob- 
ability, P..,, that signal plus noise will exceed the level 
(after summing or integration) and be identified as signal 
is then dependent upon signal-to-noise ratio. 

The Neyman-Pearson threshold detection scheme is 
presumably optimum in the sense that it requires minimum 
signal-to-noise ratio for given P,,, and P,,,. The obvious 
question arises, however, that if this procedure is optimum 
how much better is it than other possible procedures? Do 
other procedures exist which do not perhaps do the 


* Manuscript received by the PGIT, March 28, 1956. Abstracted 
from Chap. 4 of an unpublished Ph.D. dissertation, Harvard Univ., 
Cambridge, Mass., May 1, 1951. The work was done under a scholar- 
ship grant of the Sperry Gyroscope Co., Great Neck, N. Y. 

+ Brooklyn Polytech. Inst., Brooklyn, N.Y. 

1M. Schwartz, “A Statistical Approach to the Automatic 
Search Problem,” Ph.D. dissertation, Harvard Univ., 1951. 

2H. V. Hance, “The Optimization and Analysis of Systems for 
the Detection of Pulsed Signals in Random Noise,” Sc.D. disser- 
tation, Mass. Inst. Tech., Cambridge, Mass., 1951. ; 

3D. L. Drukey, “Optimum Techniques for Detecting Pulse 
Signals in Noise,’”’ presented at the IRE Natl. Conv., New York, 
N. Y., March 4, 1952. 


required job as efficiently, but which may be warranted 
anyway because of simplicity and economy of equipment 
required? 

A general answer to these basic questions will not be 
attempted here. Instead an alternative statistical de- 
tection procedure will be analyzed and compared to the 
Neyman-Pearson optimum approach. This can then 
serve as a first step in the process of finding answers to 
the above questions. 

_ The procedure to be discussed is very simple in concept 

and consists basically of a coincidence threshold scheme 
for the detection of signals in the presence of noise. The 
threshold level of the previous optimum integration 
method is retained, but now a specified number of the 
individual (nonintegrated) K pulses are required to 
exceed the level to be called a signal. It is felt that equip- 
ment-wise, this coincidence procedure may be at least as 
amenable to development as the optimum integration 
scheme, a counter (or counters) with set and reset gates 
being a possibility for a practical detection system of 
this nature. The counter would be adjusted to indicate 
signal detection if n pulses of the group of K (n < K) 
succeeded in passing the threshold level and registering 
in the counter. In a radar system a different counter 
would be required for each range element within the 
range gate of the system, or else provision would have 
to be made to store the information contained in the 
different range elements, implying a storage tube as in 
the integration case. At the end of a time equal to K 
repetition periods the counters would be reset to zero 
and the procedure again repeated. This scheme compares 
with the storage tube and associated circuitry, or other 
devices required for signal integration. 

Since such a coincidence detection method might be 
fairly readily applied practically, it is of interest to 
determine analytically how efficiently this procedure 
functions for the purpose of signal detection. The follow- 
ing analysis is aimed at answering this question. 
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To facilitate matters it is assumed that signals when 
received are all of the same constant amplitude.’ The case 
of fluctuating or unequal signals will not be considered in 
the coincidence analysis in this paper. 


COINCIDENCE ANALYSIS 


With the assumption of constant-amplitude signals 
each pulse among K successive voltage pulses will have 
the same probability, p, of exceeding a specified threshold 
level. When the pulse is that due to noise, p is exactly 
P.», (the probability of one noise pulse exceeding the 
level) of the Neyman-Pearson threshold scheme. Similarly, 
for signal plus noise, p is identically P,,,. The probability 
P that exactly n out of K pulses will exceed the level is 
then given by the binomial distribution as 


P= 207 = pee (1) 


where 


K! 


xCn = niiK — n)! 


represents the number of ways in which the required n 
successes out of K trials can occur, and (1 — p) is the 
probability of failure (7.e., the voltage fails to exceed the 
threshold level). 

The presence of a signal will, however, also be indicated 
in cases where more than n voltages of the K received 
would have exceeded the level had the counter been 
allowed to count beyond the first » pulses. The prob- 
ability, P,, that a signal will be detected by this co- 
incidence procedure is thus given by the cumulative 
binomial distribution, 

K 


Lose = dye RO me sl ad Rare 


man 


(2) 


the sum of the mutually exclusive probabilities that n 
or (n + 1) or ««*xK signal-plus-noise pulses will each 
exceed the level. 


Similarly, P,, the probability that noise will be 
erroneously called a signal is given by 
K 
ge = ~~ ppd gael — Jey Me, (3) 


(P,, and P,,, are the single-voltage probabilities defined 
previously.) 

Eqs. (2) and (8) are the basic equations of the co- 
incidence procedure. With n and K fixed, and P,, specified, 
P.,, is determined from (3). 

In this case P,,, is simply the probability that a single 
noise sample, after envelope detection, will exceed a 
fixed threshold level. With Gaussian noise assumed at the 
input to a square-law detector, P,,, is given by’ 


IPAs — Gg (4) 
with 
x ey 
q 2a Wo 
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b—the threshold level in volts, 
a—a detector constant, and 
Yo—the mean-squared noise voltage. 


Setting P, at a desired value (as operationally de- 
termined, e.g., 90 or 99 per cent,) the required value for 
P.,, is then found from (2). P,, is here the probability that 
a single signal-plus-noise sample will exceed the threshold 
level. 

The equation for P,, was derived by Rice in his classic 
paper.* Combining (4) and Rice’s results, P,,, P,», and 
mean power signal-to-noise ratio, s’, can be shown to be 
related by: 


[Pa = if Dye I 2sy) dy: 
NZ loe Pnb 


(5) 


I,(x) is the modified Bessel function of first kind and 
zero order. (Note that this equation is independent of the | 
type of envelope detector used.) | 

Eq. (5) has been plotted in Fig. 1, using the curves — 
available in Rice’s article. 
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Since P, and P, here are analogous to P,,, and P,,,, 
respectively, in the signal integration procedures of 
detection, the coincidence and integration procedures 
may be compared, for a given K, by determining the 
input signal-to-noise ratio required for each scheme (with 
Pea ie eee 


Optimum NUMBER OF COINCIDENCE CouNTS 
Thus far there has been no discussion as to the particular 
choice for the parameter n. For a given value of K, and 


#8. O. Rice, “Mathematical analysis of random noise,’’ Bell 
Sys. Tech. J., vol. 24, pp. 102-103; January, 1945. 


| 
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with P,, and P, specified, that n (or range of values of 1) 
should be chosen which results in a minimum signal-to- 
‘noise ratio required per pulse. That such an optimum 
value for n does exist can be shown qualitatively. 


Thus, for n small the very high threshold levels needed 


y . . . 

to prevent noise from being detected as signal force large 
signal powers to be needed, while for n approaching K 
the requirement that n signal plus noise voltages exceed 


the level is stringent enough to require large signal 


powers again. 


It has not been found possible to find the optimum n 


| analytically. Instead a point-by-point graphical procedure 
has been followed. For this purpose the National Bureau 
of Standards’ tables of the cumulative binomial distri- 
bution have been used,” these tables giving P,, directly 
“when P,, K and n in (2) are specified. These tables cover 
| part of the range of values of P,, (10-° > P,, > 107"), the 


remaining range required, 10°‘ to 10°, having been 
calculated from (38). Using these tables first to obtain 
P,, and P,,, and then Fig. 1 to find the required input 
signal-to-noise ratio per pulse, it was possible to plot 
this signal-to-noise ratio as a function of n (1 <n < K) 


for various values of K, P,, and P,. 


The results are shown in Figs. 2 to 6 on the next page, 
for Kk 3, 5, 10, 30, and 49, respectively, and for the 
various indicated combinations of P,, and P,. The points 
indicated on the curves are at the values of n for which the 
calculations were made. 

Since interpolation between the curves of Fig. 1 was 
necessary, signal-to-noise ratios could not be found too 
accurately, so that the results in Figs. 2 to 6 are accurate 
only to within a few tenths of a db, good enough, however, 
for the purposes of this analysis. This accounts for the 
apparently peculiar-looking shapes of some of the curves 
in these figures. 

The regions of optimum 7 are clearly shown, however, 
and are seen to be fairly broad for K large enough. 

With K fixed and greater than 10, the region of optimum 
n seems to shift slowly toward smaller values of n as P, 
decreases or P,, increases. Because of the scattering of the 
points plotted, however, it is not possible to positively 
verify this conclusion. If such a shift does occur it is small 
enough so that for 10°. < P, < 10° and 50 per cent 
< P, < 90 per cent the broad region of optimum n can 
be considered to be roughly independent of P,, and P,, 
(for a particular K). 

The trend of decreasing optimum n with increasmg K 
is definitely noticeable on the other hand. The exact 
dependence of optimum n on K has not been determined 
analytically, but judging from the graphs the optimum 
value of n seems to be roughly proportional to AEG: 

From the results for the values of K plotted, the region 
of optimum n (minimum input. signal-to-noise ratio 
within 0.2 db) may be written empirically with good 
enough accuracy as 


5 National Bureau of Standards, “Tables of the binomial prob- 
ability distribution,’ Appl. Math. Series, no. 6; January, 1950. 
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Qn i = WN hes 


This.s valid for710-"<sP,, < 10° 950 pencenta-ayee= 
90 per cent. 


ReEsuuts or ANALYSIS 


Using the optimum values of 7 the results of the co- 
incidence detection procedure have been plotted in Figs. 
7 and 8 on the next page, as the curves labeled “optimum 
coincidence.”’ (These curves thus give the minimum 
signal-to-noise ratio required for different values of K, P,, 
and P,. The minimum signal-to-noise ratios have been 
obtained from Figs. 2 to 6.) 

These results are there compared with results obtained 
for the Neyman-Pearson threshold integration procedure’ 
(“video integration’’), in which P,,, and P,,,, have been 
set equal to P, and P,, respectively. 

Also plotted are curves for the particular case of total 
or complete coincidence, all K pulses being required to 
exceed the threshold level for detection; ‘‘linear inte- 
gration’ curves (an ideal situation corresponding to 
coherent detection); and curves for a system integrating 
only as efficiently as ~/K. (Here the required signal-to- 
noise ratio decreases as the square-root of the samples 
added. This has been found experimentally to be the 
performance capability of a human operator detecting 
signals on a screen.) 

Figs. 7 and 8 indicate that the “optimum coincidence” 
curves are very nearly parallel to the Neyman-Pearson 
video integration curves (at least up to K = 49, the 
largest value of K taken), requiring 1.3 to 1.5 db more 
power for the same system performance with K > 10, 
the additional power needed decreasing with smaller 
values of K. (All threshold detection curves of course 
coincide at K = 1). This coincidence “loss” of about 1.4 
db over video integration is very nearly the same for the 


two cases plotted, P, = 90 per cent and 50 per cent 
(P,, = 10 °°), and can be shown to be about the same 
value for P, = 10°’. : 


These results mean that the coincidence procedure 
might in some cases be preferred over the integration 
procedure if the circuitry needed were less complex. 

It is noteworthy to remark that although the optimum 
coincidence procedure is 1.4 db worse than the integration 
method, it is still much better in performance than a 
system integrating only as efficiently as Nhe 

For K small enough (less than about 30 for the values 
of P,, and P, in Figs. 6 and 7) the complete coincidence 
procedure is also more efficient than the /K integration. 

It is well to remember, however, that the above con- 
clusions as to the comparative efficiency of the two detec- 
tion schemes considered are strictly correct only under the 
given assumption of equal-amplitude non-fluctuating 
signal echoes. 

Although the effect of signal fluctuation on the 
Neyman-Pearson detection procedure has been analyzed," 
similar calculations have not been carried out for the 
coincidence method. 
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Some General Aspects of the Sampling Theorem’ 


D. L. JAGERMANT AND L. J. FOGELT 


Summary—The sampling theorem is recognized as an in- 
terpolation formula. Starting from the Lagrange Polynomial, this 
theorem is developed under conditions which are of broader 
applicability than those usually stated. Such a view point indicates 
the essential unity of temporal and frequency domain application. 
It will also be shown that the theorem is applicable as an exact 
interpolation formula throughout the complex plane. The basic 
theorem is extended to include sampling of the first derivative of 
the function. The concept of band-limited functions is introduced 
through use of Fourier-Stieltjes representations. This is then 
shown to be subsumed under the general class of functions which 
is considered in connection with the interpolation theorems de- 
veloped. This approach, as presented, readily leads to the establish- 
ment of many sampling theorems. It is hoped that this paper will 
aid the formulation of particularly applicable sampling theorems 
for specific problems. 


* Manuscript received by the PGIT, May 31, 1956. 
+ Stavid Engineering, Inc., Plainfield, ING OR 
t Convair, San Diego, Calif. 


INTRODUCTION 


N SOME communication systems the discrete time 
[ samples of data which are received concerning the 

original signal must be smoothed so as to yield a 
continuous replica of the original signal. The choice of 
the smoothing process is normally dependent upon the 
amount of reliance which is placed on the sample infor- 
mation. For example, it may prove of value to average 
each received pulse of the signal so as to obtain a single 
sample value, then accomplish smoothing by application 
of the sampling theorem, that is, the use of the convergent 
cardinal series which yields all intermediate values of the 
signal. This paper will indicate the relation this smoothing 
process has with respect to interpolation theory and in 
this manner emphasize the required conditions which 
must be fulfilled for valid application of the theorem. 
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Practical application of the extended sampling theorem 
in the complex domain may serve to facilitate complex 
spectral analysis based upon the knowledge of system 
operation at some discrete points. It appears necessary 
to recognize some of the fundamental relationships which 
underlie the derivation of this theorem so as to permit 
proper choice of interpolation technique for each specific 
problem. 


Discusston 


Let the available discrete data be represented by 
f (to), ft), f(t), f (és), . f(t), and for simplicity of 


notation, 


Ft.) = J (1) 


It is often desired to construct a continuous interpolate 
from such discrete data which may be obtained as samples 
of an original continuous signal. If reliance is placed on 
the correctness of the discrete information, then it becomes 
reasonable to require the interpolate to pass through 
the sample values. (Alternate smoothing processes could 
also be chosen so as to minimize the mean square error, 
to minimize the maximum error, or to improve some 
other imposed metric or system characteristic. The 
following discussion will be restricted to errorless inter- 
polation at the sample points achieved by selection of a 
proper polynomial.) 

A polynomial of the nth degree is exactly specified by 
n + 1 points and has n zeros (including multiplicity of 
zeros). Lagrange utilized a general interpolation poly- 
nomial’ of the form 


eA) a foLo(t) a5 nee oA) oe ne ae FL) (2) 
where L‘(¢) is the Lagrangean Coefficient defined by 
L(t) = Sy) bah) 2 tes) te a) ie (CT) 
(tj SCare et) et) ae Ct) : 
(3) 
Note that this cofactor has the following ‘‘selective’’ 
property; 
LG) = 0, a) 
cone ome Oren) 


thus each sample point in time will cause all but one of 
the polynomial terms in (2) to vanish; the remaining 
term will have a coefficient of unity. In this manner the 
entire polynomial will agree with the sample data at each 
of the sample points. Such as interpolation polynomial 
is clearly unique so that any other interpolation poly- 
nomial which yields f; at ¢ = #; forO < 7 < n can be 
exhibited in the form of (2). It may be shown’ that a 
scale change applied to the time axis will not affect the 
Lagrangean Coefficients when the sample points are 


This was first due to Euler who, in a tract entitled ‘““De Eximio 
Usu Methodi Interpolationum in Serierum Doctrina,’’ had long 
before obtained a closely analogous expression. 

2 See the appendix. 
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equidistant. Fig. 1 illustrates the nature of a typical 
Lagrangean Coefficient as a function of the continuous 
argument. 


L3@ 


LSa)= (t+2) ee (t-1) (t-2) 


=4- 1-3 72 eee io} | 2 3 4 


Fig. 1—The Lagrangean coefflcient Lo’. 


It is of interest to compute the error of interpolation. 
This may be found from the relation 


ipo 
(n+ 1)! 


where 7 is a particular time which lies somewhere between 
the least and the greatest ¢;.° By the above remark con- 
cerning the uniqueness of polynomials this equation also 
represents the error obtained in interpolation through a 
large number of standard interpolation techniques, in- 
cluding the Newton-Gauss and the Newton-Gregory 
procedures. 
Let g,(t) be defined by 


then the Lagrangean Coefficient may be expressed in 
the following compact form 


#Q — Pa) = (¢-t)¢-4)---G-4) © 


PRU 
C— ts) gn(t) 


and the Lagrangean interpolation polynomial (2) becomes 


0 Deal 


which is in the form of a i fraction expansion. 
Nothing has been said to this point which necessitates 
consideration of only a real variable argument; therefore, 
let ¢ be replaced by the complex variable z. Dividing by 
Jn(2z) yields 


Pe) yes fo Ae He _ ee 
7.2) 5 @ieeealg Cunning ne® 


L(t) = (7) 


Pak t) = (8) 


(9) 


Examination of the right hand expression shows that it is 
meromorphic with a simple pole at each sample point. 
The polynomial P,,(z) is an entire function; however, the 
fraction P,,(z)/g,(z) 1s meromorphic to agree with the 
right-hand side of (9) as is seen from the nature of g,,(z). 


’ The time 7 may be eliminated from explicit consideration by 
obtaining an estimate M satisfying | f"*(r) | < M, least t; <7 < 
greatest ¢; then (5) may be written 


[$@ — PO | S pI - WU) =H) I. 
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The function, g,(z), is entire, but vanishes at the sample 
points. The fraction may be represented by 


P.2). 
Gn(2) 

This leads to a generalization where the number of 
sample points becomes infinite. A suitable choice of 


entire function which vanishes at a denumerable infinity 
of equidistant points is 


Ce)E= (10) 


Tv 


g(2) ="sin 7 (11) 


where g(z) replaces g,,(z). The zeros of this function occur at 


(12) 


and so, form a set of equidistant points which lie on a 
straight line with the central sample point placed at the 
origin. The right-hand side of (8) rewritten in terms of 
g(z) is 


2 = jlt, J = ie oy Zee LO se tel eat, 


sin 52 S Sal ve: 


PEI @j 


(13) 


which is recognized to be the cardinal series. This straight 
line may have any slope as determined by.the argument 
of the sampling distance h; h being, in general, complex. 
This paper will be limited to considering only samples 
determined by a fixed complex h; thus, sample points 
which lie on curved lines are excluded. 

The form of (9) suggests the use of contour integration 
in that it characteristically resembles the sum of residue 
terms. Accordingly, under the g(z) chosen above, consider 


1 sAC®) 
Qrt Jo, (€ — z) sin (r/h)E df, 


where the integration is carried out in the complex 
¢-plane (¢ = & + 7m in which é and 7 are real) and z ¥ gh. 
The sequence of paths, C,, may consist of any closed 
contours which, successively enclose additional pairs of 
singularities corresponding to sample points. Since 
these sample points are restricted to lie upon a straight 
line, it may prove expeditious to chose C,, to be a square as 
in Fig. 2 

It is convenient to introduce the function A(y) defined 
by 


[Oe (14) 


| flee"*) | (15) 


A(y) = max 
—aM<zr<O 


for any entire function f(z) and given constant whenever 
the maximum value exists. The following theorem will 
now be established. 
Theorem 1 

The entire function f(z) is given by 
sin (1/h)(z — jh) 

(x/h)(z — gh)” 


in which the cardinal series is uniformly convergent in 
any finite closed domain of the z-plane, ff; fh), 


n= Sov 


j=—0 
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= | h | e'®, if there is a constant K so that 


4 


Ce aG K ae ; for | y | reece 


C-PLANE $7 
C, 


[@+)a)] C140) 


-(nt5)h 


Fig. 2—Contour integration in the ¢ plane. 
To establish the theorem, it is convenient to rotate the 
path C,,. Let 
agen (16) 
then J,,, (14), may be written as 


ere / f(ze'*) 
I, = 3 : 
2nt Jdn (re * 


— zg) sin (r/|h |)r 
The new path of integration, D,,, is shown, on the 7 plane, 
in Fig. 3. This corresponds to a rotation of the sample 
points which now occur on the real 7 axis at the points 


(17) 


jl hl. 


7-PLANE (7 


[onsgaic'+ 6] 


Fig. 3—Contour integration in the 7 plane. 


Denote the integral in (17) taken over the upper 
horizontal line by J,,,, then 


(2n + 1) | A] ns | 
2a | (re 


f(ze'*) 
— z)sin (r/|h |)r 


(18) 


[Wee | cS 


4This notation means | y | approaches infinity. 
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in which the maximum is taken on the line. It is assumed 
that z is some point interior to D,. Since 


| re"? —z| >m+4) [A] —- |Z, (19) 
it follows that, 
, (2n + 1) |h | oe tea al 
tea S Sn + Dh] — 212 le [sin G/T A Dr | 
(20) 


also there is, clearly, a constant K,, independent of n, 


so that 
(2n + 1) |h | - 
<q F 
ae Cy PSE ey 
Hence 
f(re**) | 
Leesan as. ue . 22 
ea ae semen ae | a 
An upper bound on 
iat 
| sin (a/| h |)r | 
may be obtained as follows: 
1 a 21 
| sin (x/| h |)r | = eiir/lblys pia /Ihi)r 
< 2 4 —(r/|h\)T2 (23) 
Tg ike IT oy G 
in which D,, has been chosen so that 
iuie= en See lAire Ss 4, (24) 
Thus, using (23) and the premise of the theorem, 
eee | < Ae A aa) 
< AK (25) 


| 72 | 


Denote the integral along the lower horizontal line by 
I,,,3, then, in the same manner as above, 


AK, KS 


an 


Le eee Ae Ale es (26) 
The bounds given in (25) and (26) are uniform in 7. 

Denote the integral taken over the right-hand vertical 
line by J,,4, then 


i I I< TT a A(t») 
n,4 SS Qn ~ r TQ. 
m J—(n+is2)ini | te'® — z| | sin (w/| h |)z | 
(27) 
Since 
1 1 a 
ts a (r/|AL) 
wen aM ut cence lunes 8) 
it follows that, by the premise of the theorem, 
A(t») 2k 
[sin @/|h Dr] =] 721 ce 
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[re —2z| > Vee — lal. (30) 


Thus the integral in (27) converges as n — ©, and, hence, 


A ( T2) 


1 ao 
Ie ees SS 
| | T i (W373 qe T2 


~|l-ealyecost ni Aenea tal 
(31) 
Let 
Ji ihe os A(t) a 
0 (Wr? + 73 — |2 |) cosh (r/| h |) 72 
x if Sok Ee A(t») A. 
0 (Vr + 72 — |2)) cosh (x/| h |) 72 
A(t) dr 
+ | (Vr + 72 — ||) cosh (x/| h |r * 83 


in which 7 > O is arbitrary, then the integral over the 
range from 7 to infinity converges uniformly in 7,. It 
follows that 


a za) 
(agnereare (V7 4 ci oe | 2 |) cosh (x/| h |) 72 


dis ="0) 
(33) 


Denote the integral taken on the left-hand vertical 
line by J,,.., then, in a similar manner as above, one 
obtains the same bound for | J,,, | as that given for 
| Zn.4 | in (31). 

Since 


lie Pee Nios seas ie | cleo | 


an estimate can be obtained for | J, |. Choose 7 so that 
A(t) 
— |z |) cosh (x/| h |) 72 


4 A(12) 
<a es 
= I (Va? + 73 — | 2 |) cosh (/| h |) 72 


in which 7, > a > | z| and e > 0 is arbitrarily assigned. 
Choose D,, so that 


(34) 


drs 


| a 


hits & oS ; (35) 


| naalee Ge 
Lowe (37) 
a A(t») E 
— 2 ES RP? 38 
ere pepavenerimes 0 
then 

| Dele (39) 

and therefore 
lim I, = 0. (40) 


no 


Evaluation of J,, (13), by residues will now yield the 
required result. The singularity at ¢ = z has the residue 


| 


1956 


Ree 


sin (r/h)z (41) 


_ The singularities introduced by sin (r/h)¢ are all simple 


poles. To evaluate the residue at ¢ = jh, let 


| a Geese (42) 
then 
ACS) TW a gh) 
(¢ — z)sin(r/h)e (oD) (W + jh — 2) sin (4/h)W eo) 


| The coefficient of 1/W in the Laurent expansion about 


W = 0 is the required residue. Since 
SAUNT 10 Pally Carel Uh hy etna (44) 
1 1 WwW 
ee esa. wes 
1 1 1a 
sin (r/h)W G@/hW 6 (Fw Pa ae ae 
the residue is readily found to be 
oo. (47) 
Hence 
ee ee a = 
sin (/h)z 1; al & a = 2 ys 


and the theorem follows. 

The form of the cardinal series is that of a convolution 
sin (/h)z 
a/h)z 
h to be positive, it is of interest to consider the Fourier 
transform or frequency spectrum of this kernel. One has 


p[ eels] =f 


with kernel - Restricting z to the real axis and 


~ive Sin (x/h)x 


(w/h)x 
iii lw |< ; 
= ais teal es (49) 


Reference is made to Fig. 4. The frequency spectrum is 


Fig. 4—Cardinal series kernel and its transform. 


that of the ideal lowpass filter; however, the kernel is 
employed in a convolution sum rather than integral. Use 
of a low-pass filter, therefore, requires modification of 
interpretation. Let a sequence of equidistant, narrow 
pulses be used of repetition rate 1/h, width 7, and height 
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f:, then the output of a low-pass filter with this sequence 
as input closely approximates’ 


sin (x/h)(z — gh) 
(x/h)(@ — gh) 
Thus, if f(z) satisfies the conditions of theorem 1, the 


output of the filter closely approximates ;f(z). 
Interpolation using f; and f/, that is 


df(2) 


d 
de ean 


(50) 


may be readily accomplished using the guiding principle 
introduced above. The crux of the matter lies in the 
proper choice of entire function g(z). The zeros of g(z) 
must be as before, ¢.e., at 2 = jh, however, in order to 
introduce f/ every zero should be double. Thus a proper 
choice is 


g(z) = sin” ; 2 (51) 
and the appropriate integral to consider is 
tls ss 
: 109) ds, (52) 


cn (§ — 2) sin’ (w/h)¢ 


in which the paths C,, are the same as given in Fig. 2. 
This leads to the following theorem. 


SOF 


Theorem 2 


The entire function f(z) is given by 


— & [sin @/De - et | jae ats 


in which the series is uniformly convergent in any finite 
closed domain of the z-plane, f; =f (gh), 


df(z) i ig 
fo ~e Se = h , 
f= 2), ha |hle 
if there is a constant K so that 
Cae Aas) < oa for | y | econ 


Since the proof of this theorem follows the same lines 
as for theorem 1, it will be omitted. 
The interpolation series consists of two convolution 
sums whose kernels are 
(ein Ey 
(x /h)z 


used with the sample values f;, and 


oo) 


used with f%. Graphs of these kernels and their trans- 
forms are presented in Figs. 5 and 6. 


’ The approximation is improved when 7 is decreased. 
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Fig. 6—The kernel x(sin(#/h)x/(r/hx)? and its transform. 


The realization of this interpolation series is accomp- 
lished by using two sequences of pulses whose heights are 
jf; and f; with common base width 7. As in the case of the 
cardinal series, when integral convolutions are used the 
output will closely approximate 7f(z) if f(z) satisfies the 
conditions of theorem 2. 

A class of functions which is of particular interest is 
the class of band-limited functions defined as follows: A 
function f(z) is said to be band-limited if there exists a 
constant W > 0 and a function g‘”’ of bounded variation 
over the interval (—2rW, 27W) so that 


1 27 W 


f@ = e'°* dg(w). 


Qn J_orw 


(53) 


The function g(w) is called the Fourier Stieltjes spectrum 
of f(z), and W the maximum frequency. In case g(w) is 
absolutely continuous over (—27W, 27W), then (53) 
may be written 
1 21 W : 

f@ =a. f e*'@) de (54) 
and g’(w) is called simply the Fourier spectrum of f(z). 
Clearly a band-limited function is entire. 

From (53) 


| | 1 a —wy Ti edlt 
Ol<5-f eave) (55) 


in which V(w) is the variation of g(w) over (—27W, a). 
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Let 
(56) 


QrW 
Vv =| avin 
—-27W 


that is, V is the total variation of g(w) over (—27W, 21W), 
then, from (55), 


| f(2) | < em (57) 


Thus one may define the function A(y) by 


is 27 y 

A(y) = gen (58) 

Consider the product 

hl) V m(2W-1/\k\|) ly 

e FOR “A(y) at ae / ; (59) 

clearly the inequality 
—(r/\hi)lyl , Is 0 
e AGS area: ples C2 (60) 

is satisfied if 

oW — aa <0. (61) 


Hence, referring to theorem 1, the following theorem may 
be enunciated. 


Theorem 3 

A band-limited function f(z) with maximum frequency 
W is representable by 
sin (r/h)(z — gh) 

(x/h)(@ — gh) 


provided the sampling interval h satisfies 


j= Dh 


1 
[hl < op (62) 
Let zf(z) be band-limited and lim... 2f(z) = 0,-then, 


f(z) is entire and, from (57), 


V eee 


(63) 


hence theorem 4. 


Theorem 4 


If the function zf(z) is band-limited with maximum 
frequency W and lim,.., z2f(z) = 0, then f(z) is repre- 
sentable by the cardinal series provided | h | < 1/2W. 

Let f(z) be band-limited over the interval (—27W, 
2xW) with a Fourier spectrum g(w )of bounded variation, 
then 


f= 5] eal) de. (64) 
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Define g(w) to be zero outside the interval (—27W, 
27rW), then 


f@ =5- | €'**Gle) de. (65) 
Integration by parts shows that 
1 el2ntWw ; 
f@ = —95 ied e”* dg), (66) 


that is, zf(z) is band-limited and from (64) lim,.. 2f(z) = 0. 
_ Hence theorem 5. 


Theorem 5 


If f(z) is band-limited with maximum frequency W 
and Fourier spectrum of bounded variation over the 
interval (—27rW, 27W), then f(z) is representable by the 
cardinal series provided |h| < 1/2W. 

Representation of f(z) by means of the interpolation 
series of theorem 2 is immediately obtained from the 
above analysis. Theorems 6, 7, and 8 follow. 


Theorem 6 


A band-limited function f(z) with maximum frequency 
W is representable by 


[3 (x/h)(z — jh) 
(r/h)(z — gh) 


provided | h| < 1/W. 


fe 


7=-@ 


Tis, + @ - or 


Theorem 7 


If the function zf(z) is band-limited with maximum 
frequency W and lim,., zf(z) = 0, then f(z) is repre- 
sentable by the interpolation series of theorem 2 provided 


Al <a 


Theorem 8 


If f(z) is band-limited with maximum frequency W and 
Fourier spectrum of bounded variation over the interval 
(—2aW, 27rW), then f(z) is representable by the inter- 
polation series of theorem 2 provided |h| < 1/W. 

Theorem 2 and the last three theorems indicate that 
the use of both f; and f/ allows exact representation 
using a sample interval twice that required for repre- 
sentation of the same function when using f; alone. The 
above presents a procedure which may be used to develop 
interpolation series applicable to the general case where 
fi, fi, fv’, +++ f° values are available. This case has 
been considered in a previous paper [9], where it was 
indicated that, if n derivatives are available as discrete 
data, then exact representation can be obtained provided 
that 


Mate Le 


2W () 


[A|< 
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It is interesting to note that this result agrees with the 
representation in 2WT dimensional signal-space, where 
T is the time interval duration outside of which the 
sample values, f;, are small. It appears that the number 
of independent samples or orthogonal dimensions can be 
arbitrarily chosen from among the continuous function 
sample values or its derivatives. 


CONCLUSION 


Several applications of this interpolation analysis 
appear possible in connection with engineering systems; 
for example, ordinary pulse radar wherein the slant range 
is determined by the return-time interval has implicit 


-range rate information contained in the Doppler modu- 


lation of the pulse carrier frequency. Thus, if slant range 
is the desired continuous variable, then the available 
discrete data may be used to establish an approximation 
of the function and its first derivative. 

Obvious additional applications occur in the field of 
air traffic control wherein the aircraft estimated velocity 
as well as position is used upon which to base a continuous 
course plot of the air-path. 

The human operator in vehicular control normally 
uses an interpolation technique wherein the input. data is 
the function and its first derivative at discrete points in 
time. This is due to the fact that the operator observes 
each display at discrete times and almost always per- 
ceives rate as well as position of a pointer. It then becomes 
immediately evident that almost all information possible 
about the amplitude variations corresponding to the 
pointer position (relating to some parameter) can be 
discerned by observation of the display at times almost 
half as often as when the pointer position alone is observed. 
It is not intended to reify this analysis. However, certain 
aspects of human engineering may now become more 
apparent and allow for numerical approximation through 
the application of the above stated theorems. 


APPENDIX 


Let the sample points be equidistant then the following 
scale change may be made. 


(68) 
(69) 


b= hw, 


t; = hw;. 
Hence, by substitution, 
Lj(w) 
(hw —hw»)-- -(hw—hw;,_1)(hw—hw,,,)---(hw—hw,) 
~ (hw; — hwo) ++ (hw; —hw,_1)(hw; — hw; 41) > (hw; —hw,) 


[((w— wo): (w—w;-1)(W— Wj 41) (ww, ]R 


[(w; — wo) -  - (Ww; — W;—-1) (W; — Wj) (Wi —w,) JA (70) 


Thus, equidistant data may be shifted to the integral 
time points without loss of generality and no effect on 
the Lagrangean Coefficients. 
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The Axis-Crossing Intervals of Random Functions” 


J. A. McFADDEN{ 


Summary—For an arbitrary random process £(t) there exists 
a function x(t) which may be obtained by infinite clipping. The 
axis crossings of x(f) are identical with those of £(t). This paper 
relates the probability density P(r) of axis-crossing intervals to 
r(7), the autocorrelation function of x(t), i.e., the autocorrelation 
after clipping. It is shown that the expected number of zeros per 
unit time is proportional to r’(0+), i.e., the right-hand derivative 
of r(7) at sz = O. Next a theorem is proved, stating that P(r) = 0 
over a finite range 0 < 7 < T if and only if r(7) is linear in | 7 | 
over the corresponding range of | 7+ |. If r(r) is nearly linear for 
small 7, then the initial behavior of P(r) is like r’’(r). These results 
are illustrated by some simple, random square-wave models and 
by a comparison with Rice’s results for Gaussian noise. 


INTRODUCTION 


NE OF THE outstanding unsolved problems in 
() the mathematical theory of noise is the de- 

termination of the distribution of intervals 
between axis crossings. For Gaussian noise, Rice’ has 
given an approximate result and has discussed the diffi- 
culties which impede a more accurate solution. Little is 
known, however, about the interval distribution for 
non-Gaussian noise. The present paper provides a new 
approach to the problem, whereby some of Rice’s results 
may be generalized for noise with an arbitrary distribu- 
tion of instantaneous amplitudes. 

When a noise signal is passed through an _ infinite 
clipper, the axis crossings are invariant but all other 
information is lost. In this paper we attempt to relate the 
axis-crossing interval distribution to the properties of the 
signal after clipping. 

The signal after clipping is defined as follows: Let 
x(t) describe a random process which is both stationary 
and ergodic. z(t) may assume only the values +1, and 


* Manuscript received by the PGIT, May 25, 1956. 

+ U. 8. Naval Ordnance Lab., Silver Spring, Md. 

1S. O. Rice, “Mathematical analysis of random noise,’’ Bell 
Sys. Tech. J., vol. 23, pp. 282-332; July, 1944; vol. 24, pp. 46-156; 
January, 1945; see sec. 3.4. 


either value is equally likely. Let the autocorrelation 
function for this process, 7.e., the expectation E[x() 
x(t + 7)], be denoted by r(r). 

We define the probability density P(r) for axis-crossing 
intervals in the following manner: Given a zero at time #, 
P(r)dr is the conditional probability that the next zero 
lies between ¢ + 7 and t + 7 + dr. The integral of P(r)dr 
from 0 to © is unity, and the mean axis-crossing interval 
is 


B= i) Kepcne (1) 


Let 8 be the expected number of zeros per unit time. 
Then E(r) is the reciprocal of 6, for in a very long time 
interval K there will be BK zeros and GK intervals. The 
mean interval is the sum of all the intervals, K, divided 
by the number of intervals, and the result is 1/8. 

The purpose of this paper is to relate the interval 
density P(r) to the correlation r(7) for an arbitrary 
function of the type x(¢). 


EXAMPLES OF 2(t) 


We shall cite four examples of the process x(t). These 
will be used later to check the general results. The first 
three are defined without reference to the clipping process. 


Periodic Square Wave 


The first example of x(¢) is the periodic square wave 
with unit amplitude, period 27, and random time origin, 
distributed uniformly between the values 0 and 27. 

The correlation function r(7) is a triangular wave, 


2 | 7 | 
Ap! ? 


rr + 27) = r(r). 


re), = L— at Meet pans, ot 


(2) 


| 
1956 


In this case the solution for P(r) is trivial, since all 
the axis-crossing intervals have a length 7. We have, 
sing the Dirac 6 function, 


| PG) = 


Random Sequence of Pulses’ 


6(7 — T). (3) 


The second example is a random sequence of pulses of 
duration 7. At an initial instant ¢, we toss a symmetric 
coin to determine whether z(t.) shall be +1 or —1. 
Whatever the result, x(¢) remains constant until ¢ = f) + hs 
then the coin is tossed again, ete., f) is a random variable, 
distributed uniformly peeeeen 0 ake YA 


_ The correlation function consists of a single triangular 
peak.” 


Es fe ae : 
r(r) 1 i Oe Neate (4) 


= 0, Wear hrrewr Ee. 


_ The solution for the axis-crossing interval density can 
be determined easily. The only intervals possible are 
multiples of 7. The probability that an interval is (n + 1)T 
is one-half the probability that it is n7’. It follows directly 
that 


teh % 2-"e(r — nT). (5) 


Markoff Process 


The third example of x(¢) is a Markoff process. In this 
process, the probability that z(¢ + dt) is +1 or —1 
depends only on the value of «(t). In other words, the 
probability that x(¢) changes sign between ¢ and ¢ + dt is 
independent of what happens outside this interval. Let 
this last probability be adf, where a is a constant. Then 
it is known® that the correlation function is 


aye er ealel, (6) 


Furthermore the number of zeros in a given time interval 


obeys the well-known Poisson distribution,’ and the 
probability density of intervals between successive 
ZerOS is* 

PG) = "ae. . (7) 


[This process should not be confused with the Gaussian 
Markoff (or Ornstein-Uhlenbeck) process, since x(t) is 
not Gaussian. If a Gaussian Markoff process is infinitely 
clipped, the output 2(¢) will not be Markoffian.] 


Clipped Gaussian Noise 


The fourth example of x(¢) is infinitely clipped Gaussian 
noise, Let £(¢) describe a Gaussian process, with normalized 


2 Tbid., sec. 2.7. 

H. M. James, N. B. Nichols, and R. 8. Phillips, ‘“Theory of 
Servomechanisms,”’ McGraw-Hill Book Co., Inc., New York, 
Xs “og 10s PAGE 1947, 

3 Rice, loc. cit., sec. 2.7. 

4 Rice, loc. cit, sec. 3.4, following eq. (3.4-11). 
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autocorrelation function p(r), and let x(t) be the output 
voltage after é(¢) is infinitely clipped. We have 


ai) = 1 if &) >0; 
= —1 if <@) <9. 


(8) 


Then the autocorrelation function of x(f), which we call 
r(r), 1s related to p(7) by the well known aresine law,’ 


l| 


we) = sin” Rey (9) 


If p(r) is regular at r = 0, then it can be expanded in a 
power series containing only the terms of even order in r, 
since p(r) is always an even function of 7. Then it follows 
from (9) that r(r) can be expanded in a series containing 
only odd-order terms in | 7 |, except for the constant 
term, which is unity. If p(7) is regular, then it is smooth 
at +r = O and has zero slope there, but r(r) displays a 
triangular peak with zero curvature in the neighborhood 
of r = 0. For small | 7 |, the derivative p’’(r) is large and 
negative, but r’’(r) (r # 0) is positive and small in 
magnitude compared to p’’(0). [An exceptional case 
occurs when &(¢) is nondifferentiable; 7.e., the derivative 
é’(t) has infinite power. Take, for example, the Gaussian 
Markoff (or Ornstein-Uhlenbeck) process, for which 
p(r) is an exponential function® of | 7 |. In this case p(r) 
is not regular, since it displays a triangular peak at 7 = 0. 
It follows from (9) that the corresponding r(r) possesses 
an infinite cusp.] 

For Gaussian noise the exact axis-crossing interval 
distribution has not been determined. The most important 
work has been that of Rice, who calculated the con- 
ditional probability that a downward crossing lies between 
t+ 7andt-+ 7+ dr, given an upward crossing at time ¢. 
Let us call this conditional probability Q(r)dr’. Since 
the zero in dr need not be the next zero after time f, 
Rice’s Q(r) will agree closely with P(r) only when 7 is 
small. However, for a narrow-band spectrum, the range 
of agreement includes most of the practical range of 
axis-crossing intervals. 


Exprectep NuMBER oF Zeros Per Unit Time 


We shall now return to the general process x(¢) which 
we defined earlier. 

If z, ="2@,) and a = x), the probability that 
z, = —l and 2, = +1 is (1 — ri), where r,, is the 
correlation coefficient between 2, and 2. Let us derive 
this result. We let the probability that x, and 2, are 
both equal to +1 be P,,, and similarly we define P__, 
P,_, and P_,. Then the moments H(1), E(@,), E(#e), 
and E(2,2,) are obtained as follows: 


5J. L. Lawson and G. E. Uhlenbeck, ‘Threshold Signals,” 
McGraw-Hill Book Co., Inc., New York, N. Y. yD» O95 1950) 

ed. ee DoOob edule, ‘Brownian movement and stochastic equa- 
tions,’ Ann. Math., vol. 43, pp. 351-369; April, 1942; see p. 353. 

7 Rice, loc. cit., eq. (38.4- iD. 
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EQ) =P, -F Pin Pia PSS 
Eia,) = Po Pl Pa ee (10) 
EG) Pee —ePa = Pye ee 
Koe,) = Pr Pos ee ee, 
Now #UL)*= 1, £@,) = E(@;) = 0; ands itaiws) eres 


therefore the solution of these four equations yields the 
result; P_, = 4(1.= ry)? Since Py =" “the prob- 
ability that 2, and x, are of opposite sign is $(1 — 742). 
Now let ¢; = tand?, = ¢ + At, where Aft is small enough 
that the probability of finding more than one zero between 
tand ¢ + At can be neglected. Then the probability that 
x(t,) and a(t) are of opposite sign becomes the expected 
number of zeros per unit time multiplied by At. Thus 


41 — r(Ad]. (11) 
1 + 7’(0 +) At; therefore 


BAt = 


For sufficiently small At, r(At) = 
(11) yields the result, 


pr OE). (12) 
r’(0 +) denotes the right-hand derivative of r(r) with 
respect to 7, evaluated at 7 = 0. We must distinguish 


between r’(0 +) and r’(0 —) because they are never equal. 
r(r) is an even function; therefore r’(r) is odd. Then 
r’(0O +) —r’(0 —), and consequently the two de- 
rivatives cannot be equal unless they are zero. They cannot 
be zero, for then (12) would imply that 6 = 0, 7.e., no 
axis crossings at all. 

E(r), the mean axis-crossing interval, is the reciprocal 
of this 6. 

Let us apply (12) to the examples given previously. 
For the periodic square wave, we may differentiate (2). 
We have 7r’(0 +) = —2/T; therefore 6 = 1/T and 
E(r) = T, obviously in agreement with the probability 
density (3). 

In the random sequence of pulses, (4) gives us 6 = 1/27, 
E(r) = 2T, and this result may be confirmed from the 
probability density (5). 

In the Markoff process, we find from (6) that 6 = a, 
in agreement with the definition of a, and that (7) = 1 be 

For clipped Gaussian noise, we differentiate (9) wi 
respect to 7; then by a limiting process we find that 
sree Olas, (13) 
provided p’(0) = 0. This is Rice’s result.* If €(¢) is non- 
differentiable, then p’(0) ¥ 0 and 8 is infinite [for example, 
if &(t) is the output of an RC low-pass filter and the input 
is broad-band noise—Rice* has discussed this example]. 


A THEOREM CONCERNING P(r) 


Let us now prove a general theorem which relates the 
axis-crossing interval density P(r) to the autocorrelation 
r(T). 


8 Rice, loc. cit., eq. (3.3-11). 
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Theorem 1 


P(r) = 0 over a finite range 0 < 7 < T, af and only a 
r(r) is linear in | r | over the range 0 <|7 | < T. In other 
words, if the autocorrelation of x(¢) is a linear function 
of | 7 | between 7 = 0 and|7| = T, then the probability 
of an axis-crossing interval shorter than 7 is zero, and 
conversely. 

Proof of sufficient condition: The probability that 
x(t,), x(t), and x(t,) are respectively +1, —1, and +1 is 


ee 4(1 = Poets Tae ae Toa), (14) 


where the r;; are the respective correlation coefficients. 
This result may be derived by a generalization of (10). 
We write the moments (1), E(a,), E(a2), H(x3), H(a%2), 
E(a,23), H(x.7;), and H(x,x-,x3) in terms of the probabilities 
Pie) P22, ete. Theme: sete) 1,) E(@;) 
E(x,;) = r;;, and-E(a,4.2,) = 0, and solve for P,_.. 

Now let?, = t,4, = ¢+7,, and#; = ¢ + 7. Then (14) 
becomes 


IEP soi $[1 a (74) a5 (72) 7 r(T2 aaa 71)]. (15) 


By the assumption in theorem 1, we may define r(z) 
as follows: 


ae (16) 


where a isa constant, 0 <a < 2/7. Under this assumption, 
(15) becomes identically zero for all values of 7, and 7, 
in the range 0 < + < T. By symmetry the probability 
P_.,_ is also zero. Then we conclude that the probability 
of finding more than one zero in an interval of length 7 
is zero, when + < 1’; therefore we have proved sufficiency 
in theorem 1. | 

A different procedure will be employed for the proof of 
the necessary condition. 

Proof of necessary condition: The autocorrelation r(r) 
may be written as a time average, 


1—a|rl, 0 Seal aie 


x(a(t + 7) dt, (19 


(7) = lim nef 
where L is a very long period of time. The derivative 
r’'(r) may be written as follows: 

(rt) = lim Tan x(é)a’(t + 7) dt. (18) 
Now «(t) is always +1 or —1; therefore x’(f) is a sequence 
of positive and negative spikes or 6 functions, 


x(t) = 2 D7 (—1)*a(t — 4), 


where ¢; are the successive zeros of x(t). The factor 2 
arises because x(¢) changes by +2 at each crossing. 

The integrand of (18) will be zero except when ¢ + 7 = ¢; 
Let 7 be in the range 0 <r < T. If¢, isan upward crossing 
then x(t; — 7) will be negative, provided there are nc 


(19) 


*For a generalization to n variables, see J. A. McFadden 
“Urn models of correlation and a comparison with the multivariat 
normal integral,” Ann. Math. Statist., vol. 26, pp. 478-489; Sep 
tember, 1955, sec. 6. 


1956 


axis-crossing intervals shorter than 7. This last condition 
is fulfilled by the assumption in theorem 1. On the other 
hand, if ¢; is a downward crossing, then x(f; — 7) will be 
positive. In either case, the product a(t; — 7)a’(¢;) will 
be negative. For any long interval (—Z, L) the integral 
will be twice the number of zeros in the interval, with a 
negative sign. The number of zeros is 2. times the ex- 
pected number per unit time, or 28L; therefore (18) 
yields 


| i (rn) = = 28: Oe a eg he (20) 
By a similar argument for negative 7, we have 
—T<7r< 0. (21) 


| G28, 
At the point 7 = 0, r’(7) is discontinuous. But r’(0) is 
bounded, since it must le between —28 and +28. Then, 
by integration of the above equations, together with the 


initial condition on the correlation function, r(0) = 1 
we find that 


rr) = 1 — 28 | 7, 


dy 


Or ae eT (22) 


Since @ is a constant, we have proved the second half of 
theorem 1. [Note: If we let 7 — 0 from the positive side 
in (20), the result checks (12).] 

Let us examine the four previous examples of x(¢). 
The periodic square wave with period 27 and the random 
sequence of pulses of width 7 are in accord with theorem 
1, and the definitions of 7 are consistent. 

The Markoff process has no linear range for r(r); in 
fact, the second derivative r’’(7) is positive at r = 0 + 


and at 7 = O —. There is no finite range 0 < 7+ < T for 
which P(r) = 0; even for small 7, P(r) is not a small 
quantity. 


Clipped Gaussian noise has no finite linear range for 
r(r), but if p(r) is regular the curvature of r(r) will be 
small for small | 7 | > 0, since r’(0 +) = r’’"(0 —) = O. 
Also, from Rice’s work’ we know that P(r) is small for 
small 7. Thus our theorem is applicable only in a quali- 
tative, intuitive sense. If the Gaussian noise has an 
extremely narrow bandwidth, then p(r) is nearly sinusoidal 
and r(r) is nearly the triangular wave (2). As the band- 
width (before clipping) approaches zero, clipped Gaussian 
noise approaches the periodic square wave, and the 
conditions in our theorem become more nearly applicable. 


DERIVATION OF P(r) FOR SMALL 7 


We are now ready to derive an approximation for the 
axis-crossing interval density P(r) when the intervals are 
small, for an arbitrary random function of the type «x(é). 

We have already seen that the existence of a nonzero 
P(r) is somehow associated with the curvature of the 
correlation function r(r). We shall now derive the result 
that, for small 7, P(r) is directly proportional to r’’(r). 

Assumption: We assume that there exists a quantity T, 
such that the probability of more than two zeros occurring in 
an interval (t, t + T,) ts so small that it may be neglected. 
This condition can usually be fulfilled by taking 7’, small 
enough; the choice of 7, depends on the particular model 


McFadden: Axis-Crossing Intervals of Random Functions 


149 


and on the accuracy desired. For example, if most of the 
axis-crossing intervals lie near the mean interval E(r), 
then we may choose 7’; slightly smaller than 22(r), since 
the occurence of two successive axis-crossing intervals 
having a sum less than this 7, will be rare. 

Under this assumption we may proceed as in the proof 
of the second half of theorem 1. We shall apply (18) when 
7 is in the range 0 < + < T,. Suppose ¢; is an upward 
crossing in a(t), and let wu; be the preceding axis-crossing 
interval; 2.e., u; = ¢#, — t,-,. Then z(¢; — 1) will be 
negative if 7 < u; and positive if 7 > w,. ft; — 7 cannot be 
located before the next preceding zero f;_, because 
t < T, and because of our basic assumption. Now the 
probability that vw; > 7 or u; < 7 is 


if P(u) du or if P(u) du, 


respectively. Then from (18) we find 


i One —26 | P(u) du + 28 | P(u) du. (23) 
T 0 
But the integral from 0 to © is unity; therefore we have 


(7) = -28 +48 [ Pl) du.- (24) 
0 

Now let us solve for P(r). If r’’(r) exists, we may 
differentiate (24). Then if we eliminate 8 by (12), we obtain 
the result, 


Ga 


—2r'(O+) 


If 7’(r) is discontinuous at 7 = 7,, then the density P(7,) 
is a 6 function. We may calculate the probability that 
7 = 7, by evaluating (24) at 7, + 0 and at 7, — O, and 
then taking the difference. Then 


vt Ty es Fe a 
= 27 (0) 

Eqs. (25) and (26) are the desired results for the range 
0 < 7+ < T, and under the given assumption. Let us 
return to our examples of «(#). 

The periodic rectangular wave is trivial. The above 
formulas yield Pr {r = T} = 1 and P(r) = 0 elsewhere. 
This result agrees exactly with (3), since the assumption 
at the beginning of this section is correct when T, < 27. 

For the random sequence of pulses, we find Pr {r = 
T} = 1/2 and P(r) = O elsewhere in the range 0 < + < 27. 
This result agrees with (5), but we cannot obtain any 
further information about P(r). The basic assumption, 
not more than two zeros in an interval of width 7;, is 
fulfilled when 7, < 27 but not when T, > 2T. 


1G (25) 


Nee (26) 


71} 


The example in which x(#) is a Markoff process is not 
at all suited to the above theory. Since the number of 
zeros in a given interval obeys the Poisson distribution, 
there is no finite 7’, for which the basic assumption holds 
even approximately. 

For clipped Gaussian noise, we may substitute the 
aresine law (9) into (25). Then we have 
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where 8 is given by (13). This is identical with an ap- 
proximation given by Rice for the function we have 
called Q(7). 

Let us review briefly the method of Rice. Instead of 
calculating P(r), Rice calculated Q(r), which we have 
already defined. Q(7) is always a good approximation to 
P(r) if 7 is small enough. However, if &(¢) has a narrow- 
band spectrum, then most of the axis-crossing intervals 
7 are near the mean value H(r). For intervals from 0 up 
to a value slightly below 3(7), Q(r) will be very close to 
P(r), since for this range a downward crossing between 
t+ 7 and? +7 + dr is very likely the next crossing. In 
other words, Q(7) always gives the initial behavior of 
P(r), but in the narrow-band case the range of validity 
extends upward to include most of the practical range of 7. 

Rice’s expression’ for Q(r) is extremely complicated. 
For this reason, Rice specialized to the narrow-band case. 
For 7 in the neighborhood of the first maximum of Q(z), 
he obtained the approximation (27). He then simplified 
(27) even further by additional approximations. 

In deriving (27), we have not assumed a narrow band- 
width, but only that 7 < 7,. However, if the noise (before 
clipping) does have a narrow bandwidth, then 7’, can be 
chosen slightly smaller than 2/(7), since most of the axis- 
crossing intervals are near the mean. Then the range of 
validity of (27), 0 < 7 < T,, will include much of the 
practical range of 7. By our derivation, as contrasted with 
Rice’s, we conclude that the approximation (27) (in the 
narrow-band case) is valid not only near the peak of 
P(r) but also for all 7 to the left of it. 

In other words, a narrow bandwidth is not essential to 
the derivation of (27) for small r, but it is essential if (27) 
is to provide a practical approximation over the important 
range of 7. For example, for the ideal low-pass filter as 
plotted by Rice,’ (27) falls about five per cent below 
Rice’s Q(r) at the first maximum of Q(r) but fits Q(7) 
increasingly better as 7 becomes smaller. Q(r) itself is 
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reliable [as an approximation to P(r)] slightly beyond the 
peak and everywhere to the left. For 7 beyond the peak, 
(27) should not be used for the low-pass filter. 


DISCUSSION 


Let us return to the general problem concerning the 
random function x(t). The main purpose of this paper is 
not to rederive special results already known, but to 
discuss the relations between the axis-crossing interval 
density P(r) and the autocorrelation function r(r). Now 
that we have checked our general relations with existing 
results, let us consider a more general noise process, 
stationary and ergodic (but not necessarily Gaussian), 
described by the function é(¢). Let the output after infinite 
clipping be x(f), according to (8). The previous analysis 
indicates that P(r) can be related to the statistical 
properties of x(t) much more readily than to those of 
¢(¢). In general however, the statistical properties of «(2) 
[e.g., the aresine relation (9)] will not be known. 

What can we say about P(r) if we do know r(z), but 
have no further information? 

1) In any case, the mean value H(r) is the reciprogal 
of 8, where B is given by (12). 

2) If r(r) is nearly linear over a subset) range 
0 <7 < T, then the number of axis-crossing intervals in 
this range will be very small, and therefore we may choose 
T, = 2T. The basic assumption in the previous section is 
approximately fulfilled; therefore (25) provides an ap- 
proximation to P(r) in the range 0 <7 < T,. 

We have already mentioned the difficulties inherent in 
transforming the present theory into practical compu- 
tations. We feel, however, that the results of this paper 
provide a useful background for further research which 
is now in progress, and we hope that they may be of 
assistance to others as well. 
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Determination of Redundancies in a Set of Patterns* 


ARTHUR GLOVAZKYt 


Summary—A set of black-and-white patterns can be identified 
yy successive sampling of the individual ‘‘cells’? which constitute 
hese patterns. If the number of patterns is P and the number of 
ells is C, it is possible to find, for any specified sampling sequence 
it least C — P + 1 cells which may be omitted without obstructing 
inique identification. 

Such redundant cells can be found by two methods: Construc- 
ion of a ‘‘code mobile,’? and compilation of a ‘‘code schedule.” 
[The ‘‘mobile” is useful inasmuch as its topological characteristics 
an be correlated with the information capabilities and the inherent 
‘edundancy of the given identification process. The ‘‘schedule,”’ 
on the other hand, is the numerical means by which practical cases 
van be rapidly and systematically solved. 

Besides revealing the redundancies in a given set, both the 
‘mobile’ and the ‘‘schedule’’ may serve as useful tools in evaluat- 


ng and designing sampling programs. 
A paper, is any group of geometrical configurations 
which obey the following conditions: 1) The 
number of patterns P is finite; 2) all the patterns are of 
the ‘‘two-tone” (black-and-white) variety, and 3) any 
of the given patterns can be divided into a finite number 
of cells C, each of which is either wholly black or wholly 
white. 

If these restrictions are met, unique identification of 
any pattern in the given set can be accomplished by 
successive sampling of the C cells. Assuming any particular 
sampling sequence or scanning path, it is conceivable that 
identification can still be carried out even though some of 
the cells are by-passed. The objective of the following 
study is to devise a method by which the cells that are 
redundant with respect to the given identification process 
can be determined. 


INTRODUCTION 


“SET OF PATTERNS,” for the purpose of this 


THe CopE MosiLEe 


To gain a better insight into the flow and accumulation 
of information involved in pattern recognition, it is useful 
to have a close look at the so called code mobile. 

If the pattern set and the scanning path are known, 
one can readily construct such a ‘‘mobile” by sketching, 
for each pattern, a chain of “black” and “white” links 
which corresponds to the sequence of black and white 
cells in the scanning path. In drawing the chains it is 
convenient to divert the black links consistently to the 
left, and the white links consistently to the right. If all 
the chains are permitted to originate from the same point, 
some of the links corresponding to various chains will 
merge and the result would be a structure composed of a 
number of interconnected subchains. The points of inter- 
connection, the nodes of the mobile, are quite significant 
inasmuch as they are the mainsprings, so to speak, of the 
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whole structure. By construction, these nodes place in 
evidence the differences between the various patterns and 
indicate at which stages of the sampling process such 
differences reveal themselves. 

Fig. 1 presents, as a simple example, an arbitrary set of 
patterns with C = 9. Fig. 2 indicates, by assigning 
numbers to the various cells, a possible scanning path. 
The resulting mobile is shown in Fig. 3, where the numbers 
at the bottom and on the right refer to patterns and cells 
respectively. 


T L + 
LFe 


Fig. 1—An arbitrary set of 9 patterns. 
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Fig. 3—A code mobile pertinent to Figs. 1 and 2. 


It is seen that a code mobile, in our application, is 
composed of a number of levels which equals the number 
of cells C. By advancing from one level to the next we can 
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observe how the P given patterns are separated into 
smaller and smaller groups until, finally, each one of them 
is completely isolated. This process of successive separa- 
tion, placed in evidence by the nodes, is the essence of the 
identification process and the only useful operation as far 
as this process is concerned. It follows, therefore, that any 
level which does not contain any nodes is redundant and 
may be removed. As a verification it may be noticed that 
removal of such levels does not affect the original number 
of subchains and therefore the original number of distinct 
patterns which the mobile represents. In terms of the 
physical recognition mechanism, this result means that 
the cells which correspond to such redundant levels may 
be omitted from the scanning path without hindering 
the unique identification of all the given patterns. 

In the mobile of Fig. 3 the levels which do not contain 
any nodes are.3, 5, 6, 7, and 9. Fig. 4 shows the “‘reduced”’ 
mobile of Fig. 3 after the redundant levels have been 
removed. It is evident that the reduced mobile still 
represents 6 distinct patterns, although the number of 
sampled cells has been reduced from 9 to 4. 


| 


ie | he 


Fig. 4—The reduced mobile of Fig. 3. 
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THE CopE SCHEDULE 


Although the code mobile is very useful as a pictorial 
aid in understanding the principles involved in the 
recognition process, in most practical problems its con- 
struction may become quite cumbersome. A_ typical 
example is the determination of redundancies in the 


alpha-numeric characters where P = 35 and C = 100. 
The impracticability of the mobile in such a case is 
apparent. 


A considerable amount of compactness and facility is 
gained if the chains associated with the various patterns 
are expressed in binary digits, where “0” and ‘‘1”’ represent 
the white and the black cells respectively. The result is a 
series of P binary numbers, each containing C’ digits. 
Table I represents the “digitization” of Fig. 1 with respect 
to the scanning path of Fig. 2; the numbers on the left 
and at the top of the table refer to patterns and cells 
respectively. 

Let us now arrange the P numbers in the same sequence 
that their respective chains appear—from left to right— 
in the code mobile. It is observed that, thus arranged, 
every number will be numerically smaller than the 
preceding one. This follows from the fact that, in con- 
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structing the mobile, all the ‘1” links were continuousl 
driven to the left, while all the ‘0’ links were continuousl 
driven to the right; as a result, every chain represents 
number which is smaller than the number represented b: 
the chain to its left. 

Once the P numbers are arranged in a sequence o 
descending numerical value, they faithfully represent th 
corresponding mobile and may provide all the informatio 
previously derived from it. It is convenient to set thes 
numbers in a matrix-like array, called a code schedule 
This schedule would contain P rows and C columns an 
its mn’th element would describe the n’th cell of th 
m’th pattern. 

From previous discussion it is evident that if, by som 
method, we could locate the nodes of the mobile on th 
corresponding schedule, the redundant cells could im 
mediately be determined. If we now recall the propert; 
of nodes to effect successive separation, and note tha 
each separation occurs whenever a pair of black anc 
white links emerges from a subchain, we can readily 
formulate the required procedure. 

Proceeding from left to right, the initial step is to fin 
the first column in which 1 neighbors 0 and draw a separa 
tion line between the two digits which extends from th 
column in question to the last column. The arrays on botl 
sides of this line may now be regarded as new schedules ani 
the same step is applied to each one of them separately 
This procedure is repeated until all the columns ar 
exhausted or until each one of the resulting “‘subschedules’ 
contains only one row. The nodes of the mobile are noy 
represented on the schedule by the starting points of th 
vartous separation lines. The columns which do not contai 
any such starting points correspond, therefore, to th 
redundant cells. 

A better understanding of the procedure may perhap 
be achieved through a specific example. Arranging th 
rows of Table I in a sequence of descending numerica 


TABLE I 
Diairat REPRESENTATION PERTINENT TO Fics. 1 AND 2 
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value results in the code schedule of Table II. Column 
of the schedule is the first column to contain a 1-0 pai 
and therefore to initiate a separation line. From colum 
2 on, two separate schedules have to be considered. I 
column 2 one of these ‘‘subschedules” contains a 1-0 pai 
and a second separation line is initiated. From column 
on, therefore, three schedules have to be considered. I 
column 4 two of the three subschedules initiate separatio 
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, TABLE II 


Tue Copr ScHEDULE FoR TABLE I AND THE ASSOCIATED 
SEPARATION LINES 
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lines, while in columns 5, 6, and 7 no lines are initiated 
since none of the arrays in question contains a 1-0 pair. 
Row 8 initiates the fifth and the last line. It is now evident 
that columns 3, 5, 6, 7, and 9, and therefore the corre- 
sponding cells, are redundant. The ‘“‘reduced”’ schedule, 
composed of columns 1, 2, 4 and 8, is shown in Table III. 


TABLE III 
Tue REDUCED ScHEDULE OF TABLE IT 
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The validity of the reduction is verified by noticing that 
the rows in the reduced schedule are numerically distinct, 
and that if any of the remaining columns is removed, 
this distinctness is destroyed. 

As a summary, we shall repeat the steps necessary for 
the determination of redundant cells associated with any 
specified set and scanning path: 1) Digitize the patterns 
at the given sampling sequence; 2) form the code schedule 
by arranging the resulting numbers in a sequence of 
descending numerical value; 3) draw the separation lines, 
and 4) observe the columns which do not initiate any 
separation lines. The cells corresponding to these columns 
are the redundant cells. 


More Apout SCANNING PATHS 


It should be emphasized that the reduced number of 
cells, as determined above, is the minimum permissible 
number for the given scanning path. It is quite possible 
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that a different scanning path will result in a lower number, 
and therefore in a more economical identification process. 

Going back to the code mobile it may be noted that, 
starting with a single chain, one additional subchain is 
created by each one of the nodes. Since the total number 
of links in the final (lowest) level is always P, it follows 
that the number of nodes is in all cases P — 1. 

The way in which these P — 1 nodes are distributed in 
the various levels is a determining factor in the amount 
of redundancy inherent in the given set and scanning 
path. The longest possible scanning path occurs when the 
nodes are distributed in P — 1 different levels, and the 
number of cells required for identification is then P — 1. 
Consequently the number of redundant cells is never 
smaller than C — P + 1. In the best possible case the 
nodes are distributed in successive powers of 2, and the 
number of cells required for identification is then log, P. 
Certainly such a case is impossible unless log, P is an 
integer. 

Thus the node distribution, as determined via the code 
mobile or its equivalent schedule, not only reveals the 
redundant cells in any given case, but also enables us to 
estimate the efficiency associated with the specified 
scanning path and the advisability of imtroducing an 
alternative one. 


CoNCLUSION 


It has been found that the code mobile is an extremely 
useful instrument for achieving introspection as well as 
deriving general relationships pertinent to pattern recog- 
nition processes. 

The code schedule, the numerical counterpart of the 
mobile, proved to be a satisfactory tool for fast determi- 
nation of redundancies in practical examples. 

It is believed that both the ‘‘mobile” and the ‘“‘schedule”’ 
may be helpful not only in problems involving analysis 
of given recognition systems, but also in problems in- 
volving synthesis of such systems (e.g., finding the sampling 
sequence which yields the largest number of disposable 
cells). 

Finally it may be mentioned that the methods and 
conclusions presented in this paper are not restricted to 
patterns only, but may be directly extended to any class 
of “two-tone” storable messages. 
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Kent R. Johnson, author of the article “Optimum, 
Discrete Filtering of Signals Containing a Non- 
random Component,” which appeared on pages 49-55 of 
June, 1956 issue of IRE Transactions on INFORMATION 
Turory, has requested that the following corrections be 
made by the replacements listed below. 


Linear, 


“Predicted” with ‘‘predicated” in the second sentence 


of the Summary. 
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PGIT News 


(The following account of the 1956 Sym- 
posium on Information Theory is due to 
Prof. Peter Elias, chairman of the organizing 
committee, and does not note the tremendous 
effort expended by the organizing committee, 
particularly Prof. Elias, to whom much of 
the credit for the great success of this 
Symposium belongs. 

—The Editor) 


1956 Symposium ON INFORMATION 
THEORY 


The 1956 Symposium on Information 
Theory was held September 10-12 at the 
Kresge Auditorium of the Massachusetts 
Institute of Technology, Cambridge, Mass. 
Like the 1954 Symposium, it was organized 
by the Professional Group on Information 
Theory of the Institute of Radio Engineers 
and the Research Laboratory of Electronics 
of M.1I.T. The Office of Naval Research, the 
Air Research and Development Command, 
the Signal Corps Engineering Laboratories, 
and URSI were cosponsors. The meeting 
was attended by about 300 people. It was 
more international than the 1954 meeting. 
It included two French scientists, Prof. B. 
Mandelbrot, of the University of Geneva, 
who gave a paper on thermodynamics, and 
Dr. M. P. Schutzenberger, now a visiting 
research associate at M.I.T., who gave a 
paper on coding. Two German scientists, 
Dr. T. von Randow, who is a Common- 
wealth Fund Fellow visiting at M.I.T., 
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ing (32). 


(34). 


“CN® + 1)” with “(N* — 


Decembei 


“B” with “2” on the left side of (23). 
“N” with ““H” on the left side of (25). 
“Crammer’s’ with ‘‘Cramer’s’’ in the sentence follow. 


“ny + 1” with “p = 1” as the lower limit on the sum. 
mation in the (1, k + 1) element of the determinant ir 


1)” in both (37) and (38). 


The second minus sign on the right side of (45) with 


and equality sign. 


Place brackets around the right side of (28). 
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and Dr. H. Mittelstaedt, who is spending 
the year at Tufts University, attended. 
Three Russian scientists, Dr. B. V. 
Gnedenko, mathematician from the Math- 
ematical Institute in Kiev, Dr. V. I. 
Siforov, chairman of the Russian equivalent 
of IRE, and Dr. D. Panov, director of the 
Institute of International Information, 
Academy of Sciences, USSR, were also at 
the meeting. 

The nineteen papers scheduled for pre- 
sentation at the meeting were printed in 
the September issue of the PGIT Trans- 
Actions. In addition, a paper by Dr. 
Siforov arrived Friday, September 7, in 
the mail, and a paper by A. N. Kolmogorov, 
the most distinguished living Russian 
mathematician, arrived on Sunday, Sep- 
tember 9, with Dr. Gnedenko. These 
papers were translated and duplicated, 
and 300 copies were passed out to the 
Symposium participants on Tuesday morn- 
ing, September 11. This made some 
intelligent discussion possible when the 
papers were presented on Wednesday. The 
PGIT is deeply indebted to Morris D. Fried- 
man, who translated, and to Electronics 
Translations, who did the duplicating, for 
making this material available so rapidly. 
The translations are being published in 
this issue of the PGIT Transactions to 
complete the permanent record of the 
Symposium. 

Some of the events at the Symposium 
do not appear in the TRANsacTIONS. One 
of these was a warm welcome and a de- 


scription of the place of information theory 
at M.I.T., by Chancellor Stratton. A 
second was an informal report on Monday 
evening by Prof. J. McCarthy, of Dart- 
mouth College, Dr. M. Minsky, of Harvard, 
and Dr. R. J. Solomonoff, of the Technical 
Research Center, of the results of a group 
which met at Dartmouth this Summer to 
study problems of artificial intelligence. A 
third was Prof. Norbert Wiener’s after- 
dinner talk at the Tuesday banquet, on 
information theory and its relation to 
physics and the analysis of causality, which 
was enjoyed by an audience of 220. A 
fourth was Dr. Gnedenko’s brief speech 
replying to the comments on Kolmogorov’s 
paper. Gnedenko said that Russian math- 
ematicians, as well as engineers, felt that 
information theory was the most important 
scientific event of the past decade. 

The smooth running of the Symposium 
was due largely to its Treasurer, Ralph 
Sayers, who was in charge of arrangements, 
with the assistance of Mrs. N. Hidelberg, 
Miss D. Scalon, R. Keyes, and other 
members of the R.L.E. staff. The very 
considerable effort involved in making the 
Russian participation possible and providing 
translation service at the meeting was 
undertaken by the Hospitality Committee 
consisting of Dr. Paul Green, Prof. Y. 
Yngve and Prof. M. Halle, of M.I.T., 
and Prof. L. Dolansky, of Northeastern. 
Dr. E. C. Rowan, of the National Academy 
of Sciences, was also of great assistance in 
helping with foreign participation. 
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born on March 30, 1915 in Minneapolis, 
Minn. He attended Armour Institute, 

Chicago, Ll., from 
1932 until 1934, 
when he entered 


M.I.T., where he re- 
ceived the BS. 
degree in 1936 and 
the Ph.D. degree in 
1939, both in math- 
ematics. He was a 
part-time instructor 
at eMail edunimne: 
1936-1939. 

From 1939 to 
1948, he was a 
Proctor Fellow, then Fine Instructor, and 
later research associate at Princeton Uni- 
versity. He was on active duty with the 
Naval Reserves during World War II. 

Dr. McMillan joined the staff of Bell 
Telephone Labs., Inc., in 1946, as a re- 
search mathematician working principally 
with network theory and the statistical 
and mathematical theory of communi- 
cation. For the past year, he has been 
assistant director of systems engineering. 

He is a member of the American Math- 
ematical Society, Institute of Mathematical 
Statistics, Society for Industrial and 
Applied Mathematics, and the American 
Association for the Advancement of Science. 

Robert Price (8’48-A’54) was born in 
West Chester, Pa., on July 7, 1929. He 
received the A.B. degree in physics from 
Princeton Univer- 
sity in 1950, and 
the Se.D. degree in 
electrical engineering 
from the Massachu- 
setts Institute of 
Technology in 1953. 

While at Prince- 
ton, he participated 
in the 1949 cosmic 
ray expedition to the 
Central Pacifie and 
during the summers 
was employed in tele- 
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vision research at the Phileo Corporation. 
At M.1.T. he held a fellowship in the 
Research Laboratory of Electronics and 
later transferred to the Lincoln Laboratory 
where he carried out his doctoral thesis 
on communication via multipath. For a 
year he was engaged in radio-astronomy 
under a Fulbright award at the Common- 
wealth Scientific and Industrial Research 
Organization in Sydney, Australia. Since 
1954 he has been a staff member of the 
Lincoln Laboratory. 

Dr. Price is a member of Phi Beta Kappa, 
Sigma Xi, and the Franklin Institute. 
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For a photograph and biography of 
Mischa Schwartz, see p. 98 of the June, 
1956, issue of IRI TRANSACTIONS ON 
INFORMATION THEORY. 
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Claude E. Shannon (S ’36—M ’48— 
SM ’49—F ’50) was born in Gaylord, 
Mich., on April 30, 1916. He received the 
bachelor’s degree in 
electrical engineer- 
ing and mathematics 
from the University 
of Michigan. After 
four years of grad- 
uate study at M.LT., 
he was awarded the 
master’s degree in 
electrical engineer- 
ing and the Ph.D. 
degree in  math- 
ematics in 1940. 

During these years 
at M.I.T. Dr. Shannon served as research 
assistant in the electrical engineering de- 
partment and later as an assistant in the 
mathematics department. 

As a National Research Fellow Dr. 
Shannon studied at the Institute for Ad- 
vanced Study, Princeton, N.J., in 1940, 
and in 1941 he joined the staff of the Bell 
Telephone Laboratories, Murray Hill, N.J., 
as a research mathematician, and earlier 
this year, he was appointed visiting pro- 
fessor of electrical communications at 
M.I.T. 

Dr. Shannon has contributed to many 
fields of applied mathematics. At M.I.T., 
he was in charge of operating the Insti- 
tute’s differential analyzer. His work in 
mathematical theory has also led to the de- 
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velopment of maze-solving machines, an 
he has more recently undertaken sue 
probability problems as the design of re 
liable machines consisting of unreliabl 
components. 

Dr. Shannon’s work has been recog 
nized by the award of the Alfred Nobl 
Prize of the American Institute of Electr 
cal Engineers, the Morris Liebmann Awar 
of the IRE, and the Stuart Ballantin 
Medal of the Franklin Institute. In 1954 h 
was awarded the honorary degree of Maste 
of Science by Yale University. 

Dr. Shannon is a member of Sigma X 
Phi Kappa Phi, and the American Mathe 
matical Society. He is co-author of a boo 
on the mathematical theory of communica 
tion, the author of a number of technica 
papers, and the holder of a number « 
patents. 
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Vladimir I. Siforov was born May 31 
1904 in Moscow. He entered the Mechani 


cal-Electro-Engineering Institute ri 
Moscow and_ the 
attended the Electre 
Engineering Insti 
tute in Leningrac 


graduating in th 
radio faculty in 192¢ 

In 1927 he bega 
working in variou 
factories and _ re 
search institutes 
the radio field 
Beginning in 193 
he was lecturer o: 
the theory of alter 
nating currents and radio receivers in th 
Electro-Engineering Institute. 

He was awarded the degree of docto 
of technical sciences in 1937, and in 193! 
was awarded the degree of professor i 
the chair of radio receivers, a position h 
has held in several Leningrad universities 

Since 1953 Dr. Siforov has been the diree 
tor of the Research Institute of the Minis 
try of Communications, USSR. The sam 
year he was elected associate membe 
of the Academy of Sciences. 

Since 1954 he has been in charge o 
a laboratory in the Research Radio an 
Electronic Engineering Institute of th 
Soviet Academy of Sciences. He is chair 
man of the Central Board of the Scientifi 
Radio and Electrical Society. 
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